A Behind the Scenes Look at Mobile Ad Fraud

February 2, 2018 Chris Baker

How did I use over a gigabyte of mobile data in a single day? Why is my phone as warm as a hot plate? If you have ever asked yourself either of these questions, you might be the victim of a malicious application that is using your device and consuming your mobile bandwidth to facilitate ad fraud. We have recently identified a large population of apps, distributed through the Google Play Store, that support this behavior. These apps are installed on devices across most of the major cell phone carriers around the world. These carriers operate in the US (AT&T, Verizon, Sprint, and T-Mobile), Europe (KPN, Vodafone, Ziggo, Sky, Virgin, TalkTalk, BT, O2, and T-Mobile), and the Asia Pacific region (Optus, Telstra, iiNet, and others) [Note: Mobile providers and Google have been notified]. Just this morning, before this article was published, BuzzFeed broke another ad fraud story.

The Mechanics of the Grift

Online advertising consists of a complex ecosystem of ad buyers, sellers, exchanges, and data providers. Website operators and application authors have space in their content layout, and moments of interaction in the user experience, that can carry various forms of advertising content. Making markets that facilitate the exchange of these units of attention and interaction is a challenge; implementing such markets so that they also support real-time bidding is even harder. Within this complex service mesh are integration points that expose oversights or inaccurate assumptions. When an advertising slot is sold, the expectation is that a bona fide human end user will see the ad content. The malicious applications covered in this blog were configured to sidestep this expectation, so the application illegitimately claims credit for an end user having viewed the ad content.

Step one in this fraud scenario: the mobile application visits a web page controlled by the fraud operator. This behavior is automated, and the rate of requests increases when the phone is connected to a power source. The server hosting the web page takes on the appearance of popular websites, including cbsnews[.]com, aplus[.]com, wnyc[.]org, and others. If you open one of these websites while you’re reading this, you will see that each of these pages contains a number of online advertisements, including static images and video. The organizations paying for these ads are targeting specific audiences and customer demographics. In other words, mobile users of NPR might see very different ads from desktop visitors to NASCAR[.]com. Because of this feature of the online ecosystem, companies bid and pay more for their ads to reach their target audience.

Once you consider that the website being visited affects which consumers an ad targets, it’s no surprise that bad actors try to trick advertising platforms into giving them access to higher-value ads. The ad fraud operator creates the pretense of selling billboard space in Times Square when really they are selling space inside the warehouse at the end of Indiana Jones. The ad that someone is paying to have shown to a visitor of wnyc[.]org will be downloaded and viewed, or clicked on, by a process running on the device without any end user being involved. The ad contract will be considered complete. Money will change hands because the advertisement was delivered with the requisite pixels and cookies tracked.

The trick to this fraud relies on setting the Referer [sic] field in the HTTP request header. The Referer [sic] field (https://tools.ietf.org/html/rfc2616, section 14.36) informs the site you’re visiting which site you came from, or were referred from, as the field name implies. Borrowing from the above examples, the goal is to have the Referer [sic] field claim the visitor is coming from a popular website like wnyc[.]org. This is accomplished by configuring a webserver, controlled by the fraudster, to answer requests whose Host header names the website they are impersonating. This is as easy as going into the webserver configuration file, in this case NGINX, and setting the server_name directive to “wnyc.org”:

server {
    listen       80;
    server_name  wnyc.org  www.wnyc.org;
    ...
}

That’s it.  The malicious mobile application sends a GET request to the fraudster’s webserver and loads specially crafted HTML and JavaScript engineered to start the scam.  The web page contains all the required logic and instructions for the application to execute the scam. The HTML landing page that the app visits is a mashup containing components of a legitimate web page with some modified tags and added JavaScript.  As the page is loaded, the mobile device runs the downloaded JavaScript and follows the embedded instructions.  It follows links and passes along the referrers from the fraudster’s server (example below).  With these referrers in place, the pretense of an interaction with the advertising platform is established.

Origin: http://wnyc[.]org

Referer: http://www.wnyc[.]org/
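
To see why a server_name entry is all the setup the impersonation needs, here is a minimal Python sketch of name-based virtual host selection. It is a simplified model, not NGINX's actual implementation: it handles exact names only and ignores wildcard and regular-expression names. The SERVER_BLOCKS table, its labels, and the select_server_block helper are hypothetical and exist only for illustration.

```python
from typing import List, Tuple

# Hypothetical server blocks: (server_name entries, label for the content served).
# The first entry stands in for the default server on the listening port.
SERVER_BLOCKS: List[Tuple[List[str], str]] = [
    (["_"], "default site"),
    (["wnyc.org", "www.wnyc.org"], "fraudster's mashup page"),
]


def select_server_block(host_header: str) -> str:
    """Pick a server block by exact Host match, falling back to the default."""
    # Drop any ":port" suffix and normalize case before comparing.
    host = host_header.split(":", 1)[0].strip().lower()
    for names, label in SERVER_BLOCKS:
        if host in names:
            return label
    return SERVER_BLOCKS[0][1]


if __name__ == "__main__":
    for host in ("www.wnyc.org", "WNYC.org:80", "example.com"):
        print(host, "->", select_server_block(host))
```

The point of the sketch is that the match is a plain string comparison against whatever Host value the client sends. Nothing in this step checks that the server operator owns the name it answers to.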

This web server then becomes a Referer [sic] forging machine. The referrer is the ticket in the door; it qualifies the endpoint to receive higher quality ads, but the endpoint still needs to fulfill the terms of the contract. The next event is a request made to tagmoxie[.]com; it passes along the Origin and Referer [sic] headers, as well as the X-Requested-With header that contains the name of the system and name of the application. The URI contains variables that define the pixel height and width, the referring domain, and a variable titled ‘cb’, most likely used for cache busting or as a form of unique token. The path of the GET request includes a /tag/ subdirectory followed by the supply side ID related to the transaction. This number later appears as the supply_tag variable.
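
When reviewing captured traffic, it can help to pull these fields out of each tag request mechanically. The following is a rough sketch using Python's standard library. Only the /tag/ path segment and the cb variable come from the traffic described above; the host adserver.example and the parameter names w, h, and domain are assumptions made up for this example, as is the parse_tag_request helper.

```python
from typing import Dict, Optional
from urllib.parse import parse_qs, urlsplit

# Hypothetical request URL. Only "/tag/<id>" and "cb" mirror the article's
# description; the host and the w/h/domain parameter names are assumptions.
EXAMPLE_URL = "http://adserver.example/tag/12345?w=480&h=640&domain=wnyc.org&cb=987654321"


def parse_tag_request(url: str) -> Dict[str, Optional[str]]:
    """Extract the supply-side ID and the query variables from a tag request."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]

    # The supply-side ID is taken to be the path segment right after "tag".
    supply_tag = None
    if "tag" in segments:
        idx = segments.index("tag")
        if idx + 1 < len(segments):
            supply_tag = segments[idx + 1]

    # Keep the first value of each query variable.
    query = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {
        "host": parts.hostname,
        "supply_tag": supply_tag,
        "width": query.get("w"),
        "height": query.get("h"),
        "referring_domain": query.get("domain"),
        "cb": query.get("cb"),
    }


if __name__ == "__main__":
    print(parse_tag_request(EXAMPLE_URL))
```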

The response to this GET request is an Extensible Markup Language (XML) document, called a Video Ad Serving Template (VAST), which defines the ad content. The VAST template provides a standard structure for exchanging the metadata required for serving video ads. It also contains details about who is providing the ad, its duration, unique identifier, etc. Additionally, the document defines the path to access the associated media (in this case a SWF file and a JavaScript file), as well as a series of tracking events. The media files declare the Video Player Ad-Serving Interface Definition (VPAID) as their API framework. VPAID provides a standard interface for integration between the ad content and the video player. This integration provides granular metrics and reporting about the ad viewing experience. For example, as the ad is loaded, the impression needs to be tracked by interacting with an API ( /api/events? ). This ensures that details of the country code, template, etc., are associated and passed along.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<VAST version="2.0">
  <Ad>
    <InLine>
      <AdSystem>Tagcade</AdSystem>
      <AdTitle>VPAID Client</AdTitle>
      <Error><![CDATA[http://tagmoxie.com/api/events?event=error&demand_tag=<REDACTED>&supply_tag=<REDACTED>&trid=0&url=wnyc.org&ip=<REDACTED>]]></Error>
      <Error><![CDATA[http://log.tagcade.com/log?type=vast&event=error&error=[ERRORCODE]&tagId=<REDACTED>&playerCb=[CACHEBUSTING]&requestId=<REDACTED>&country=<REDACTED>&cb=<REDACTED>]]></Error>
      <Impression><![CDATA[http://tagmoxie.com/api/events?event=impression&demand_tag=<REDACTED>&supply_tag=<REDACTED>&trid=0&url=wnyc.org&ip=<REDACTED>&identity=<REDACTED>&cb=<REDACTED>]]></Impression>
      <Impression><![CDATA[http://log.tagcade.com/log?type=vast&event=impression&tagId=<REDACTED>&playerCb=[CACHEBUSTING]&requestId=<REDACTED>&country=<REDACTED>&cb=<REDACTED>]]></Impression>
      <Creatives>
        <Creative sequence="1">
          <Linear>
            <Duration>00:00:30</Duration>
            <AdParameters>{"requestId":"<REDACTED>","secure":false,"country":"<REDACTED>","domain":"wnyc.org","player_size":"medium","tagId":"<REDACTED>","waterfallId":<REDACTED>,"waterfall":"<REDACTED>","ivtPixelConfigs":"W10="}</AdParameters>
            <MediaFiles>
              <MediaFile delivery="progressive" height="640" width="480" apiFramework="VPAID" type="application/x-shockwave-flash">http://cdn.pubvantage.com/vpaid-client-1.0.16.swf</MediaFile>
              <MediaFile delivery="progressive" height="640" width="480" apiFramework="VPAID" type="application/javascript">http://cdn.pubvantage.com/vpaid-client-1.0.17.js</MediaFile>
            </MediaFiles>
            <TrackingEvents>
              <Tracking event="start"><![CDATA[http://tagmoxie.com/api/events?event=start&demand_tag=<REDACTED>&supply_tag=<REDACTED>&trid=0&url=wnyc.org&ip=<REDACTED>]]></Tracking>
              <Tracking event="firstQuartile"><![CDATA[http://tagmoxie.com/api/events?event=firstQuartile&demand_tag=<REDACTED>&supply_tag=<REDACTED>&trid=0&url=wnyc.org&ip=<REDACTED>]]></Tracking>
              <Tracking event="midpoint"><![CDATA[http://tagmoxie.com/api/events?event=midpoint&demand_tag=<REDACTED>&supply_tag=<REDACTED>&trid=0&url=wnyc.org&ip=<REDACTED>]]></Tracking>
              <Tracking event="thirdQuartile"><![CDATA[http://tagmoxie.com/api/events?event=thirdQuartile&demand_tag=<REDACTED>&supply_tag=<REDACTED>&trid=0&url=wnyc.org&ip=<REDACTED>]]></Tracking>
              <Tracking event="complete"><![CDATA[http://tagmoxie.com/api/events?event=complete&demand_tag=<REDACTED>&supply_tag=<REDACTED>&trid=0&url=wnyc.org&ip=<REDACTED>]]></Tracking>
              <Tracking event="close"><![CDATA[http://tagmoxie.com/api/events?event=close&demand_tag=<REDACTED>&supply_tag=<REDACTED>&trid=0&url=wnyc.org&ip=<REDACTED>]]></Tracking>
              <Tracking event="pause"><![CDATA[http://tagmoxie.com/api/events?event=pause&demand_tag=<REDACTED>&supply_tag=<REDACTED>&trid=0&url=wnyc.org&ip=<REDACTED>]]></Tracking>
              <Tracking event="resume"><![CDATA[http://tagmoxie.com/api/events?event=resume&demand_tag=<REDACTED>&supply_tag=<REDACTED>&trid=0&url=wnyc.org&ip=<REDACTED>]]></Tracking>
              <Tracking event="acceptInvitationLinear"><![CDATA[http://tagmoxie.com/api/events?event=acceptInvitationLinear&demand_tag=<REDACTED>&supply_tag=<REDACTED>&trid=0&url=wnyc.org&ip=<REDACTED>]]></Tracking>
              <Tracking event="timeSpentViewin"><![CDATA[http://tagmoxie.com/api/events?event=timeSpentViewin&demand_tag=<REDACTED>&supply_tag=<REDACTED>&trid=0&url=wnyc.org&ip=<REDACTED>]]></Tracking>
              <Tracking event="otherAdInteraction"><![CDATA[http://tagmoxie.com/api/events?event=otherAdInteraction&demand_tag=<REDACTED>&supply_tag=<REDACTED>&trid=0&url=wnyc.org&ip=<REDACTED>]]></Tracking>
              <Tracking event="creativeView"><![CDATA[http://tagmoxie.com/api/events?event=creativeView&demand_tag=<REDACTED>&supply_tag=<REDACTED>&trid=0&url=wnyc.org&ip=<REDACTED>]]></Tracking>
              <Tracking event="mute"><![CDATA[http://tagmoxie.com/api/events?event=mute&demand_tag=<REDACTED>&supply_tag=<REDACTED>&trid=0&url=wnyc.org&ip=<REDACTED>]]></Tracking>
              <Tracking event="unmute"><![CDATA[http://tagmoxie.com/api/events?event=unmute&demand_tag=<REDACTED>&supply_tag=<REDACTED>&trid=0&url=wnyc.org&ip=<REDACTED>]]></Tracking>
              <Tracking event="fullscreen"><![CDATA[http://tagmoxie.com/api/events?event=fullscreen&demand_tag=<REDACTED>&supply_tag=<REDACTED>&trid=0&url=wnyc.org&ip=<REDACTED>]]></Tracking>
            </TrackingEvents>
            <VideoClicks>
              <ClickTracking><![CDATA[http://tagmoxie.com/api/events?event=click&demand_tag=<REDACTED>&supply_tag=<REDACTED>&trid=0&url=wnyc.org&ip=<REDACTED>]]></ClickTracking>
            </VideoClicks>
          </Linear>
        </Creative>
      </Creatives>
    </InLine>
  </Ad>
</VAST>
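
A response like this can be summarized programmatically. Below is a minimal sketch using Python's xml.etree.ElementTree. SAMPLE_VAST is a trimmed, made-up stand-in with placeholder hosts (tracker.example, cdn.example), because the redacted capture above would need its placeholders substituted before an XML parser would accept it. The summarize_vast helper is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict

# Trimmed, hypothetical VAST document modeled on the structure shown above.
SAMPLE_VAST = """<VAST version="2.0">
  <Ad>
    <InLine>
      <AdSystem>ExampleAdSystem</AdSystem>
      <Impression><![CDATA[http://tracker.example/api/events?event=impression&supply_tag=1]]></Impression>
      <Creatives>
        <Creative sequence="1">
          <Linear>
            <Duration>00:00:30</Duration>
            <MediaFiles>
              <MediaFile delivery="progressive" apiFramework="VPAID" type="application/javascript">http://cdn.example/vpaid-client.js</MediaFile>
            </MediaFiles>
            <TrackingEvents>
              <Tracking event="start"><![CDATA[http://tracker.example/api/events?event=start&supply_tag=1]]></Tracking>
              <Tracking event="complete"><![CDATA[http://tracker.example/api/events?event=complete&supply_tag=1]]></Tracking>
            </TrackingEvents>
          </Linear>
        </Creative>
      </Creatives>
    </InLine>
  </Ad>
</VAST>"""


def summarize_vast(xml_text: str) -> Dict[str, Any]:
    """Collect the ad system, duration, impression URLs, media, and tracking URLs."""
    root = ET.fromstring(xml_text)
    return {
        "ad_system": root.findtext(".//AdSystem"),
        "duration": root.findtext(".//Duration"),
        "impressions": [e.text.strip() for e in root.iter("Impression") if e.text],
        "media_files": [
            (m.get("type"), (m.text or "").strip()) for m in root.iter("MediaFile")
        ],
        # Map each tracking event name to the URL the player is told to call.
        "tracking": {
            t.get("event"): (t.text or "").strip() for t in root.iter("Tracking")
        },
    }


if __name__ == "__main__":
    summary = summarize_vast(SAMPLE_VAST)
    print(summary["ad_system"], summary["duration"])
    for event, url in summary["tracking"].items():
        print(event, "->", url)
```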

We know, as the owner of the device, that we did not initiate the GET request to the referrer-forging server. We also know that this GET request has nothing to do with the service delivery of the installed mobile application. In this initial interaction, the mobile application passes a unique identifier and the application name, so there is sufficient data to attribute the traffic back to the mobile device. It could be that the traffic from the device itself is the product of the malicious app and the operator of the webserver is purchasing the traffic in order to execute their own independent scheme. The server responsible for configuring the Referer [sic] cycles through different referring sites to avoid drawing suspicion, e.g., after using wnyc[.]org, the scheme rotates to use cbsnews[.]com, Univision[.]com, sciencechannel[.]com, etc. From this point in the workflow, we can track the supply_tag and demand_tag variables being used and look for clusters of who is involved. From there, we would try to establish where the guilt falls. At a minimum, this illustrates the complexity of the market.
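
One simple way to start looking for such clusters is to count how often each supply_tag and demand_tag pair appears across captured tracking URLs. The sketch below assumes the URLs have already been extracted from a traffic capture. The SAMPLE_URLS values and the tag_pairs helper are hypothetical, and this is only a first pass, not the analysis method used for this post.

```python
from collections import Counter
from typing import Iterable, Tuple
from urllib.parse import parse_qs, urlsplit

# Hypothetical tracking URLs; real ones would come from a traffic capture.
SAMPLE_URLS = [
    "http://tracker.example/api/events?event=impression&demand_tag=d1&supply_tag=s1",
    "http://tracker.example/api/events?event=start&demand_tag=d1&supply_tag=s1",
    "http://tracker.example/api/events?event=impression&demand_tag=d2&supply_tag=s1",
    "http://tracker.example/api/events?event=error",
]


def tag_pairs(urls: Iterable[str]) -> Counter:
    """Count (supply_tag, demand_tag) pairs seen in tracking URLs."""
    pairs: Counter = Counter()
    for url in urls:
        query = parse_qs(urlsplit(url).query)
        supply = query.get("supply_tag", [None])[0]
        demand = query.get("demand_tag", [None])[0]
        # Skip URLs that do not carry both identifiers.
        if supply and demand:
            pairs[(supply, demand)] += 1
    return pairs


if __name__ == "__main__":
    for (supply, demand), count in tag_pairs(SAMPLE_URLS).most_common():
        print(f"supply_tag={supply} demand_tag={demand} seen {count} time(s)")
```

Pairs that recur across many devices or many forged referrers would be natural starting points for a closer look.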

The issue remains that a nefarious actor has control over a device with an IP address associated with a mobile provider. The device is visiting a webserver configured to forge a Referer [sic] for a premium web property. The webserver takes on the appearance of the premium web property and instructs the device to display the ads. A follow-up post is in the works that seeks to unravel the knot of who is profiting from this on the ad tech side and how. A special thanks to Google for taking down the malicious apps, the Moat team for their insight, the telecom companies, and WNYC for all of the fine NPR content and secure web defaults.
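
The header combination seen in this traffic, a Referer [sic] naming a premium website alongside an X-Requested-With header naming an application, suggests one rough screening signal. The sketch below is a heuristic under stated assumptions: it assumes the application name looks like a dotted package identifier, and the PREMIUM_DOMAINS set, the PACKAGE_NAME pattern, and the looks_suspicious helper are hypothetical. Legitimate in-app browsers may produce the same combination, so a match is a reason to look closer, not proof of fraud.

```python
import re
from typing import Mapping, Set
from urllib.parse import urlsplit

# Hypothetical list of premium properties to watch for.
PREMIUM_DOMAINS: Set[str] = {"wnyc.org", "cbsnews.com"}

# Assumed shape of an application identifier, e.g. "com.example.flashlight".
PACKAGE_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_]*(\.[A-Za-z][A-Za-z0-9_]*)+$")


def looks_suspicious(headers: Mapping[str, str],
                     premium_domains: Set[str] = PREMIUM_DOMAINS) -> bool:
    """Flag requests whose Referer claims a premium site while an app name is present."""
    referer_host = (urlsplit(headers.get("Referer", "")).hostname or "").lower()
    if referer_host.startswith("www."):
        referer_host = referer_host[4:]
    app_name = headers.get("X-Requested-With", "")
    return referer_host in premium_domains and bool(PACKAGE_NAME.match(app_name))


if __name__ == "__main__":
    request_headers = {
        "Origin": "http://wnyc.org",
        "Referer": "http://www.wnyc.org/",
        "X-Requested-With": "com.example.flashlight",
    }
    print(looks_suspicious(request_headers))
```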

Interim Prevention and Mitigation

As an end user, there are several basic actions you can take to prevent your device and bandwidth from being co-opted in one of these schemes. Do some due diligence before installing applications on your devices; even if the application is being distributed from an official source like the Google Play Store, you should still check the application’s developer and the device access it requests. Consulting past reviews can also indicate whether the application might be malicious. A large number of the malicious applications we studied had a distribution of reviews that skewed toward “bad”, with comments complaining about the applications crashing or behaving poorly.


About the Author

Chris Baker

Chris is a Principal Data Analyst at Dyn, a cloud-based Internet Performance company that helps companies monitor, control, and optimize online infrastructure for an exceptional end-user experience. Follow Dyn on Twitter: @Dyn.
