It started easy: you just ‘monkey patch’ APIs to insert logging before/after invocation! But then corner cases started emerging.
Some important DOM properties (for example,
toString and rich stack traces on exceptions.
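To make the in-band approach and one of its corner cases concrete, here is a minimal TypeScript sketch. It is only an illustration: `patchWithLogging`, `looksNative`, and `fakeApi` are hypothetical names, and `JSON.parse` stands in for a browser API. It wraps a function with logging and then shows that `Function.prototype.toString` no longer reports native code for the patched function.

```typescript
type AnyFn = (...args: unknown[]) => unknown;

// Hypothetical helper: replace target[name] with a wrapper that logs
// before and after calling the original function.
function patchWithLogging(
  target: Record<string, unknown>,
  name: string,
  log: string[],
): void {
  const original = target[name] as AnyFn;
  target[name] = function (this: unknown, ...args: unknown[]): unknown {
    log.push(`call ${name}(${args.map(String).join(", ")})`);
    const result = original.apply(this, args);
    log.push(`return ${name} -> ${String(result)}`);
    return result;
  };
}

// Built-in functions stringify with a "[native code]" body; script-defined
// wrappers stringify to their own source text instead.
function looksNative(fn: unknown): boolean {
  return (
    typeof fn === "function" &&
    Function.prototype.toString.call(fn).indexOf("[native code]") !== -1
  );
}

// Stand-in for a browser API object (assumption for this sketch).
const fakeApi: Record<string, unknown> = { parse: JSON.parse };
const callLog: string[] = [];

const nativeBeforePatch = looksNative(fakeApi.parse);
patchWithLogging(fakeApi, "parse", callLog);
const nativeAfterPatch = looksNative(fakeApi.parse);

(fakeApi.parse as AnyFn)("[1,2]");
console.log(nativeBeforePatch, nativeAfterPatch, callLog);
```

A page script that runs the same `toString` check can tell that the API has been wrapped, which is one reason in-band instrumentation is hard to hide.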
The alternative was to bake our instrumentation into Chromium’s C++ source code. This has been done before, but the traditional drawback has been maintainability: browsers are updated constantly, and few researchers can prioritize keeping their patches current.
To save VisibleV8 from such a fate, we pursued a strategy of radical minimalism, confining our patches strictly to V8 rather than the whole browser and hooking a handful of ‘choke-points’ within V8 to produce all-or-nothing instrumentation. The final patches made few invasive changes and added fewer than 600 new source lines of code (SLOC). For comparison: Chromium as a whole comprises millions of SLOC.
Native function calls were straightforward to instrument, as V8 channels all such calls through a single gateway function.
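The appeal of a choke-point can be shown with a toy model. The sketch below is not V8 code, and every name in it (`natives`, `callNative`, `nativeTrace`) is made up for illustration. It assumes only that every native call is dispatched through one function, so a single logging hook covers all of them.

```typescript
// Toy model only: none of these names come from V8.
type NativeFn = (...args: number[]) => number;

// Table of "native" functions reachable from scripts.
const natives = new Map<string, NativeFn>([
  ["abs", Math.abs],
  ["max", Math.max],
]);

const nativeTrace: string[] = [];

// The single gateway: every native call is dispatched through here, so one
// hook logs every function in the table, including any added later.
function callNative(name: string, ...args: number[]): number {
  const fn = natives.get(name);
  if (fn === undefined) {
    throw new Error(`unknown native function: ${name}`);
  }
  nativeTrace.push(`${name}(${args.join(", ")})`);
  return fn(...args);
}

const absResult = callNative("abs", -3);
const maxResult = callNative("max", 1, 7);
console.log(absResult, maxResult, nativeTrace);
```

Compared with patching each API separately, there is only one place to maintain, and scripts cannot reach a native function without passing through the hook.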
VisibleV8 allowed us to discover new artefacts we had no prior knowledge of
Of course, hooking all property accesses is expensive, even under JIT. We measured a ~60% slowdown on the Speedometer full-browser benchmark, and a few of Dromaeo’s aggregated microbenchmarks were much worse. However, we observed that VisibleV8 outperformed equivalent in-band instrumentation wherever such comparisons were possible.
We visited the Alexa top 50k websites using VisibleV8 to look for evidence of crawling countermeasures. Specifically, we looked for code probing properties that do not exist in proper browsers but which are artefacts of headless/automated browsers (that is, bots).
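A rough sketch of this kind of log analysis follows. The record shape (`PropertyAccess`) is hypothetical and is not VisibleV8’s actual log format. The artefact names are well-known leftovers of PhantomJS and Nightmare, chosen for illustration; the list used in the study may differ.

```typescript
// Hypothetical log record; VisibleV8's real log format is not shown here.
interface PropertyAccess {
  scriptOrigin: string; // origin of the script performing the access
  receiver: string;     // kind of object accessed, e.g. "Window"
  property: string;     // property name that was read
}

// Illustrative artefact names that some automation tools leave on `window`.
const BOT_ARTEFACTS = new Set<string>(["_phantom", "callPhantom", "__nightmare"]);

// Return the sorted, de-duplicated origins whose scripts read any artefact.
function originsProbingForBots(accesses: PropertyAccess[]): string[] {
  const origins = new Set<string>();
  for (const access of accesses) {
    if (access.receiver === "Window" && BOT_ARTEFACTS.has(access.property)) {
      origins.add(access.scriptOrigin);
    }
  }
  return Array.from(origins).sort();
}

// Made-up sample data for the sketch.
const sampleLog: PropertyAccess[] = [
  { scriptOrigin: "https://site.example", receiver: "Window", property: "innerWidth" },
  { scriptOrigin: "https://tracker.example", receiver: "Window", property: "callPhantom" },
  { scriptOrigin: "https://tracker.example", receiver: "Window", property: "_phantom" },
];

const flaggedOrigins = originsProbingForBots(sampleLog);
console.log(flaggedOrigins);
```

Because such properties are absent in ordinary browsers, any read of them is a strong hint that the script is checking for automation.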
We found bot detection activity on 29% of the visited domains (over 73% of it coming from third-party iframes).
VisibleV8 is freely available and under active development and maintenance (up to Chrome 79 as of writing). We hope it will serve as a foundation for other researchers’ tools providing deep insight into dynamic behaviour on the Web.
Watch: Jordan Jueckstock presents VisibleV8 at IMC 2019.