In 2018, while working in programmatic advertising, I became interested in the connections between sites that can be discovered by analysing their ads.txt files, among other attributes. From this, I created the site DigitalAd.Tech to find websites that should be considered for addition to existing blocklists in programmatic buying systems. Although built as a hobby outside of work, this project directly increased my efficiency when reviewing sites to be blocked from programmatic ad buying and, while it is difficult to quantify, I would like to hope it improved media performance for the clients where these blocklist improvements were applied.

Built primarily over the course of a few nights in late November 2018, the DigitalAd.Tech system was able to find connections between sites and understand ad resellers' networks using:

  • ads.txt file entries.
  • app-ads.txt file entries.
  • Server IP Addresses.
  • Domain WHOIS information.
  • DNS records, particularly TXT verification entries.
  • Social sharing meta information.
  • Google Analytics tracking code.
  • Google Tag Manager code.
  • Facebook tracking code.
  • Various other tracking code implementations with unique identifiers.
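
To illustrate the first item in the list above, here is a minimal sketch of how ads.txt entries could be parsed and used to find sites that list the same seller account. This is not the original DigitalAd.Tech code; the function names `parse_ads_txt` and `sites_sharing_sellers` and the example domains are hypothetical, and it assumes the file contents have already been downloaded as text.

```python
from collections import defaultdict


def parse_ads_txt(text):
    """Return a set of (ad_system_domain, seller_account_id, relationship) tuples."""
    entries = set()
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" in line.split(",", 1)[0]:
            continue  # skip blank lines and variable lines such as contact=...
        fields = [field.strip() for field in line.split(",")]
        if len(fields) < 3:
            continue  # malformed record, which is common on the open web
        domain = fields[0].lower()
        account_id = fields[1]
        relationship = fields[2].upper()
        if relationship in ("DIRECT", "RESELLER"):
            entries.add((domain, account_id, relationship))
    return entries


def sites_sharing_sellers(ads_txt_by_site):
    """Map each (ad system, account id) pair to the sites that list it,
    keeping only pairs that appear on more than one site."""
    sites_by_seller = defaultdict(set)
    for site, text in ads_txt_by_site.items():
        for domain, account_id, _relationship in parse_ads_txt(text):
            sites_by_seller[(domain, account_id)].add(site)
    return {
        seller: sites
        for seller, sites in sites_by_seller.items()
        if len(sites) > 1
    }


# Hypothetical sample data for illustration only.
sample = {
    "a.example": "google.com, pub-1, DIRECT, abc # main account\ncontact=x@y.example\n",
    "b.example": "GOOGLE.com, pub-1, reseller\nappnexus.com, 55, DIRECT",
    "c.example": "appnexus.com, 99, DIRECT",
}
shared = sites_sharing_sellers(sample)
print(shared)
```

A seller account shared by two sites is only a hint: large resellers appear in many unrelated ads.txt files, so a match like this would normally be weighed alongside the other attributes in the list.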

Individually, a number of these attributes would not be a significant indication of a connection; together, they aided in identifying thousands of sites that had strong relationships to sites that had already been placed on our programmatic blocklists.
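
One way to picture how weak signals add up is a simple weighted score over the attributes two sites have in common. The sketch below is an assumption about how such a combination could work, not a description of how DigitalAd.Tech actually scored connections; the signal names, the weights, and the `connection_score` function are all hypothetical.

```python
# Hypothetical weights: illustrative values only, not taken from the original system.
# Attributes that are cheap to share by coincidence (an IP address on shared
# hosting) get a low weight; unique identifiers get a higher one.
SIGNAL_WEIGHTS = {
    "ads_txt_seller": 1.0,
    "server_ip": 0.5,
    "whois_registrant": 2.0,
    "dns_txt_verification": 3.0,
    "google_analytics_id": 3.0,
    "gtm_container_id": 3.0,
    "facebook_pixel_id": 3.0,
}


def connection_score(site_a, site_b):
    """Score the overlap between two sites.

    Each site is a dict mapping a signal name to a set of observed values.
    Returns the total score and the shared values per signal.
    """
    score = 0.0
    shared = {}
    for signal, weight in SIGNAL_WEIGHTS.items():
        common = site_a.get(signal, set()) & site_b.get(signal, set())
        if common:
            score += weight * len(common)
            shared[signal] = common
    return score, shared


# Hypothetical example: a site already on a blocklist and a candidate site.
blocked_site = {
    "server_ip": {"203.0.113.7"},
    "google_analytics_id": {"UA-1234-1"},
}
candidate_site = {
    "server_ip": {"203.0.113.7"},
    "google_analytics_id": {"UA-1234-1", "UA-9-1"},
    "ads_txt_seller": {("google.com", "pub-1")},
}
score, shared_signals = connection_score(blocked_site, candidate_site)
print(score, shared_signals)
```

With a scheme like this, a shared IP address alone would stay below any sensible review threshold, while a shared IP address plus a shared analytics ID would stand out.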

After a year of running this project, scanning hundreds of thousands, if not millions, of sites (I wish I had backed up the databases before deleting them, so that I knew this number!) and learning about the inconsistencies of the web, the datasets had grown to the point that my personal server could no longer respond to requests in a reasonable time. With the application needing to be rearchitected and my personal attention turned elsewhere, I decided that it was time to discontinue the project.

Programming Languages: Python, HTML, CSS
Framework: Django, Bootstrap
Technologies: Celery, RabbitMQ, SendGrid, SQLite, Postgres