In 2018, while working in programmatic advertising, I became interested in the connections between sites that can be discovered by analysing their ads.txt files, among other attributes. From this, I created the site DigitalAd.Tech to find websites that should be considered for addition to existing blocklists in programmatic buying systems. Although built as a hobby outside of work, this project directly increased my efficiency in reviewing sites to be blocked from programmatic buying and, while it is difficult to quantify, I would like to hope it also improved media performance for the clients where these blocklist improvements were applied.
Primarily built over the course of a few nights in late November 2018, the DigitalAd.Tech system was able to find connections between sites and map the networks of ad resellers using:
- ads.txt file entries.
- app-ads.txt file entries.
- Server IP Addresses.
- Domain WHOIS information.
- DNS records, particularly TXT verification entries.
- Social sharing meta information.
- Google Analytics tracking code.
- Google Tag Manager code.
- Facebook tracking code.
- Various other tracking code implementations with unique identifiers.
Individually, many of these attributes would not be a significant indication of a connection; together, they helped identify thousands of sites with strong relationships to sites that had already been placed on our programmatic blocklists.
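To make the idea concrete, here is a minimal Python sketch of how shared attributes could be turned into scored links between sites. It is not the original DigitalAd.Tech code: the function names, the attribute kinds, the choice to match Google Analytics IDs at account level, and the weights in `WEIGHTS` are all assumptions made for illustration.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical weights: how strongly one shared attribute of each kind
# suggests a relationship. These values are illustrative only.
WEIGHTS = {"ga_account": 3, "gtm_id": 3, "ads_txt": 2, "ip": 1}


def parse_ads_txt(text):
    """Return a set of (ad_system, seller_id, relationship) records."""
    entries = set()
    for raw in text.splitlines():
        line = raw.split("#", 1)[0]           # drop trailing comments
        line = line.split(";", 1)[0].strip()  # drop extension fields
        if not line or "=" in line.split(",", 1)[0]:
            continue  # blank line or a variable such as contact=...
        fields = [f.strip() for f in line.split(",")]
        if len(fields) < 3:
            continue  # malformed record
        entries.add((fields[0].lower(), fields[1], fields[2].upper()))
    return entries


def extract_tracking_ids(html):
    """Find Google Analytics account IDs and Tag Manager IDs in page source."""
    ids = set()
    # Assumption: match on the account part (UA-123456) so that different
    # properties under one account still link together.
    for account in re.findall(r"\bUA-(\d{4,10})-\d{1,4}\b", html):
        ids.add(("ga_account", "UA-" + account))
    for gtm in re.findall(r"\bGTM-[A-Z0-9]{4,}\b", html):
        ids.add(("gtm_id", gtm))
    return ids


def shared_attributes(site_attrs):
    """Map each pair of sites to the set of (kind, value) attributes they share."""
    index = defaultdict(set)  # attribute -> sites that carry it
    for site, attrs in site_attrs.items():
        for attr in attrs:
            index[attr].add(site)
    links = defaultdict(set)
    for attr, sites in index.items():
        for a, b in combinations(sorted(sites), 2):
            links[(a, b)].add(attr)
    return links


def score_links(links, weights=WEIGHTS):
    """Sum the weight of every shared attribute for each pair of sites."""
    return {
        pair: sum(weights.get(kind, 0) for kind, _ in attrs)
        for pair, attrs in links.items()
    }
```

With attributes collected per site as `(kind, value)` tuples, for example `("ads_txt", record)` for each parsed ads.txt record, `score_links(shared_attributes(site_attrs))` gives a score per pair of sites. A pair sharing only a server IP scores low, while a pair sharing an analytics account and several seller IDs scores high, which reflects the point that no single attribute is decisive on its own.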
After a year of running this project, scanning hundreds of thousands if not millions of sites (I wish I had backed up the databases before deleting them, so I knew this number!) and learning about the inconsistencies of the web, the datasets had grown to the point that my personal server could no longer respond to requests in a reasonable time. With the application requiring rearchitecting and my personal attention turned elsewhere, I decided that it was time to discontinue this project.
Programming Languages: Python, HTML, CSS
Frameworks: Django, Bootstrap
Technologies: Celery, RabbitMQ, SendGrid, SQLite, Postgres