When you're a massive international news organization, revamping your flagship platform is only part of the story. Adapting and consolidating operations to stay relevant in an ever-changing media landscape is the order of the day.
A key part of that story is obviously understanding what it is your readers like, what they actually read and what they really should know about. To help it achieve those goals, The Guardian has built its own in-house analytics engine called Ophan. At its core, it's a bit like Chartbeat, Parsely or many other analytics platforms.
Behind the scenes, it publishes about a quarter of a billion events per day, and typically the lag before something shows up on the dashboard is somewhere between three and five seconds. But it hasn't always been such a powerhouse.
The project grew out of a hack day, but the team couldn't bring themselves to turn it off at the end of the weekend. So they just left it on, running in the background. Three months later, The Guardian's director of architecture, Graham Tackley, decided to devote some real time and attention to what it could achieve. Tackley and Chris Moran, The Guardian's digital audience editor, quickly saw that by measuring site data more closely, the organization as a whole could potentially benefit.
"It's like understanding how journalism works digitally […] the ideal is that we should all understand that. And that eventually kind of codified into the idea of democratizing that data within The Guardian. All of those sub-editors that were working without guidance, essentially. We wanted them to be able to see it for themselves," Moran said.
When it was ready, they opened it up to the whole company, to anyone who could potentially benefit from additional data insight. Since switching it on, Ophan has gone from zero to more than 950 monthly active users within the organization, Moran explained.
"The key thing about that organic growth [is that] you don't get 950 people [to use a tool like Ophan] by buying in a tool and going through some massive training process," Moran said. "The way you get that is by making the tool useful to the individuals within this building, so everything it has grown into has been a direct result of people within editorial and beyond – but very much focused around editorial – saying 'I've got this problem, could Ophan help with this?'"
The answer to "why do we need Ophan?" isn't "because data," it's because it can help us do our editorial jobs – even if that's just as simple as showing some section editors that they might be wasting their time micro-managing the front page and should instead be focusing, at an article level, on how to move people along to other pieces of content. That in itself helps them prioritize their time.
A concern frequently cited by editorial staff when talking about data is the fear that if you only pander to readers and a wider audience, you end up with content that falls somewhere between asinine and anodyne. However, perhaps unsurprisingly, Moran disagrees.
Page views are much maligned; people say why use page views, because it just leads to clickbait – but actually, if what you're trying to do is judge how well you've promoted a piece of content, it's really effective. It's not necessarily a sign of quality – as long as we all understand that, that's fine – but it can tell you whether something is working.
The thought here is that if "page views" is a useful metric in its own right, it somewhat removes it from the advertising process – clicks for feedback, rather than to drive revenue. Moran explains that there's no commercial aspect to his job at all. "We basically work on the principle that if we get our excellent journalism in front of the widest possible audience for each piece, that's probably going to have a good commercial benefit – and surely that's also an editorial aim?"
Ophan isn't just tracking page views. It's also looking at things like median attention time, using the same sort of methodology as Upworthy or Chartbeat, to find out where users are engaging with each page. This goes some way to arguing against pure page view measurements, as the team can see whether they're just clicks or whether people are actually engaging with the article. And what does "engagement" consist of here?
"The way it works is that there has to be evidence you're actively doing something on the page – it has to be in the foreground tab, and you have to be moving the mouse or scrolling, or clicking, or doing something like that," Tackley said. "Every time you do that, the timer starts for five seconds."
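The accounting Tackley describes – each qualifying interaction opens a five-second attention window, with overlapping windows merged so time isn't double-counted – can be sketched in a few lines. This is a minimal illustration of that idea, not Ophan's actual code; the function name and event representation are assumptions:

```python
def attention_seconds(event_times, window=5.0):
    """Total attention time from a sorted-or-unsorted list of interaction
    timestamps (seconds) - mouse moves, scrolls, clicks while the tab is
    in the foreground. Each event opens a `window`-second timer;
    overlapping timers are merged so no second is counted twice."""
    total = 0.0
    active_until = None  # end of the currently open attention interval
    for t in sorted(event_times):
        if active_until is None or t > active_until:
            # The previous window expired before this event: fresh interval.
            total += window
        else:
            # Still inside an active window: extend it to t + window.
            total += (t + window) - active_until
        active_until = t + window
    return total
```

For example, events at 0s, 2s and 10s yield 12 seconds of attention: the first two windows merge into one seven-second interval, and the third adds a separate five.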
The median attention time view in Ophan is split into 10-second chunks, and visitors are color-coded depending on how long they stayed on a page. People who left before 10 seconds are broadly classed as a "bounce". This data can give an indication of how any particular piece of content is resonating with the audience, but different formats perform differently. A live blog behaves very differently from a regular article because people are refreshing it, so the attention time tends to get dragged down. Live blogs also tend to be pretty long in comparison to regular written content.
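The bucketing just described is straightforward to sketch. Assuming only the rules stated above – 10-second chunks, sub-10-second visits counted as bounces, and a median across all visitors – a hypothetical summary (not Ophan's real API or return shape) might look like:

```python
from statistics import median

def summarize_attention(times, bucket=10):
    """Bucket per-visitor attention times (seconds) into `bucket`-second
    chunks, count sub-bucket visits as bounces, and report the median."""
    buckets = {}
    bounces = 0
    for t in times:
        chunk = int(t // bucket)  # 0 => bounce, 1 => 10-19s, and so on
        if chunk == 0:
            bounces += 1
        buckets[chunk] = buckets.get(chunk, 0) + 1
    return {
        "median_attention": median(times) if times else 0,
        "bounces": bounces,
        "buckets": buckets,  # chunk index -> visitor count
    }
```

Note how a live blog would distort this picture: each refresh restarts the page timer, pushing visitors into lower chunks and dragging the median down regardless of how engaged they actually were.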
By looking at these figures, section heads, editors and journalists can see, broadly speaking, how long any article should take to read, how many people went beyond that and what the median attention time was.
What this doesn't mean, however, is that there's a magic formula for working out optimum article length, Moran says.
One of the reasons I worry about attention time as a metric around editorial content – particularly as an indicator of quality – is because that's a really tempting thing to do, to go "right, if we aggregate this data, we can find out the peak length of an article," but I just think that's nuts. If people worry about page views as a metric leading to clickbait, attention time leading you to curtail or lengthen your articles is, I think, much, much more dangerous when you talk about editorial content.
There's obviously a lot of buzz around attention time and various other things – at a macro level, it makes sense that if you have a lot of quality journalism, you'd expect people to spend more [time] on your site. But it's also a very slippery metric… we're learning that what you can't do at an article level is assume that time spent on page is a clear indication of quality, because it's affected by so many other things.
The "Russell Brand" effect
For anyone who works in publishing and has any understanding of the internet and social, it's going to come as little surprise to hear that just dumping a URL onto the internet and hoping people find it isn't a very good strategy. It needs promotion, and much of that is via social channels like Twitter and Facebook, as well as Google Search.
As such, Ophan has a whole bunch of icons and shortcuts indicating exactly where any one piece of content has already been promoted. If something hasn't been, journalists and editors are encouraged to hassle Moran and his team.
"You can't actually identify by subject what does badly, broadly speaking," he explained. "The one thing those pieces that do badly have in common is that we haven't promoted them. When you're doing 500 pieces a day, it's really easy for bits to slip through the cracks. If you're the journalist who's interested in it, or the sub-editor, or the editor, you can immediately get a sense of whether or not we've pushed this piece of content."
One thing that doesn't work, however, is teasers on social media. Moran has a serious distaste for social media "gurus" who advise the use of them.
"It's absolute horse shit. Every single time you use a teaser on social media, you might get high click-through – but it's very unlikely, as people don't have the time – and what you definitely get is immediate bounce," he said.
"The problem is people see Russell Brand going 'I done this' or 'this is interesting', and they all go 'it got 500,000 retweets' and everyone gets very excited, but Russell himself doesn't know how many people bounced off that fucking article. Also, Russell Brand is Russell Brand; he's not a newspaper," he added.
Ophan provides the team with a level of granularity that turns it from analytics reporting into an exploratory tool. A nice example of this is a piece around the Nigerian elections that The Guardian ran a couple of weeks ago. By looking at the data, the team could see that 35 percent of the total views came from within Nigeria, which is validating for the journalist and editor working on the piece.
More than that though, it also threw up an interesting data set from an unknown device type – it turned out to be a mobile browser that is peculiarly specific to Nigeria.
"Fundamentally, if we're looking at expanding into other territories, that kind of information is really, really useful to us," Moran said.
That kind of data exploration helps the team understand the relationship between all its different channels too, including social and search.
"Unknown [traffic] really interests me because there's a lot of rubbish about it on Twitter. Really, really intelligent, well-qualified, respected people talking only about dark social – because it's a wacky term," Moran said. "The simple fact is: social is clearly important in it, and Facebook definitely, a bit… but search is a huge part of this, and anyone who's not talking about search when they're talking about unknown doesn't know what they're talking about."
An assumption too far
While based around data and analysis, Ophan is essentially designed to be an insight machine – in the right hands. Perhaps controversially, this means that Moran thinks it should be used to inform editorial judgement and what gets produced. Obviously, some people worry that this could lead to impaired judgement.
The interesting thing about that is that it assumes a couple of things. The first is that operating in total ignorance is good and will always lead to the right decision; I know personally, just from watching the data versus that process, that this is very much not the case.
Secondly, it assumes that the data is in control of you, that you're not in control of the data. The point of this – and the reason it's not automated into all of our various processes – is that we want human beings between the data and editorial. We want to be data-informed, not data-led. There are times when the data will tell us something, or confirm something we might already know, and we might very well ignore it.
He continues, arguing that being informed by the data can actually lead to higher quality journalism. Writing about Justin Bieber isn't the recipe for success you might think it to be.
One of the reasons behind that, if you look at the data, is that you're probably talking about a lot of Google traffic, if you're thinking about scale – and Justin Bieber is one of the most competitive search terms in the world, with a lot of places like TMZ that have much, much bigger domain authority around it. So it's quite useful just to burst myths like that.
"We've consistently shown on Facebook that our core journalism around social ills, politics and everything else works really nicely on Facebook, on a mass platform," added Tackley, Ophan's architect. "That's real journalism, real articles, not listicles. If we just trusted those lazy myths, we'd never try pushing that kind of content out."
Part of the battle seems to be understanding what the data is actually telling you; there are a whole lot of different metrics you could look at, but if you don't fully understand what you're seeing, it's easy to miss the real implications.
For example, page views per visit are often measured to give an indication of loyalty, but if you improve your article page so that you link better, your page views per visit should rightly go down; people find things quicker because they don't have to navigate to a different section to find something else to read.
The future of news?
Loyalty is something pretty high on the agenda for Tackley and Moran. Right now, Ophan's not really very good at measuring it.
"There's some vague approximation to it in there, but we need to do a better job of understanding whether this is someone who's coming back to this article or series every week, or is it someone who's just coming in for the first time, etc. Having some indication of that in an understandable way that you can action [would have value]. I don't know how to do that, but we'll figure it out," Tackley said.
One of the other big things that the pair are excited about is tracking video, both on-site and off-site.
"The offsite is really critical for the future," Moran said. "We know that natively we can embed a video in Facebook and the reach will be huge compared to, say, on-site. Everybody knows this, but how do you compare that success to something like a piece that gets 10 times fewer plays on-site but that carries a pre-roll, for example? That's really difficult."
With humble beginnings at a hack day, and now with 950 monthly users within the organization and the power to inform the international news operation, Ophan has clearly demonstrated that the future of The Guardian is a data-driven one.
Data-informed. I mean, data-informed.
With that comes an editorial responsibility to resist the temptation to cover topics in ever decreasing circles. Indeed, if Tackley and Moran are correct, Ophan's value comes in allowing exploratory promotion of content, as well as the potential to experiment with the editorial agenda.
Originally posted via "How The Guardian's Ophan analytics engine helps editors make better decisions"