If you need yet another number to understand just how thoroughly Facebook dominates the (social) Web, Embedly has one for you: 42 percent of all URLs that the service processes have one or more Open Graph tags. If adoption keeps growing, it won't be long before every other URL features Open Graph tags.
The data comes straight from the horse's mouth: Embedly co-founder and CEO Sean Creeley. Like most statistics, the number started with a simple question (actually, there were three). Over on Quora, someone posted this: "How has the Facebook Open Graph affected embed.ly's scraping systems? Is it generally easier to gather data now? What percentage of scraped links include Open Graph data?"
Creeley responded by saying:
Yes. Open Graph makes it easier to gather data, but it's not always the data you want. While Open Graph has a ton of different facets, we generally stick to image, video, audio and description so I'll talk to those.
You need to look at it from a publisher's side rather than a scraping side. The content placed in those tags are specifically designed to be used in Facebook. The images are generally 90px wide, the video generally is optimized for viewing in a 400px embed and the description is short. It's perfect for Facebook's use of the data, but for general purpose it's very limiting.
The web is getting bigger; images, videos and descriptions. Could Flipboard or Pinterest solely use Open Graph? No, not really.
Embedding is also about user expectation. If there is a huge image in the middle of the page, yet the Open Graph tag is a default gif, it's not a great experience.
Embedly uses Open Graph as a fall back. If we can't find anything else in the page that is better, we will default back to Open Graph.
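To make the fallback idea concrete, here is a minimal Python sketch. It is not Embedly's code: the class and function names are hypothetical, and "the first `<img>` on the page" is only a stand-in for whatever Embedly considers "better" content, which the answer does not spell out. The one thing taken as given is the Open Graph convention of `<meta property="og:..." content="...">` tags.

```python
from html.parser import HTMLParser


class PageMetadataParser(HTMLParser):
    """Collect Open Graph tags and the first <img> seen in the markup."""

    def __init__(self):
        super().__init__()
        self.open_graph = {}     # e.g. {"image": "...", "description": "..."}
        self.first_image = None  # assumption: stand-in for "better" page content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            prop = attrs.get("property") or ""
            if prop.startswith("og:") and attrs.get("content"):
                # Keep the first value seen for each Open Graph property.
                self.open_graph.setdefault(prop[3:], attrs["content"])
        elif tag == "img" and self.first_image is None and attrs.get("src"):
            self.first_image = attrs["src"]


def parse_page(html):
    """Parse an HTML string and return the populated parser."""
    parser = PageMetadataParser()
    parser.feed(html)
    return parser


def pick_image(html):
    """Prefer an image found in the page; fall back to og:image."""
    page = parse_page(html)
    return page.first_image or page.open_graph.get("image")


def has_open_graph(html):
    """True if the page carries at least one Open Graph tag."""
    return bool(parse_page(html).open_graph)
```

With this sketch, a page that has both a large inline image and a default `og:image` gif yields the inline image, while a page with only Open Graph tags yields the `og:image` value.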
At first, Creeley didn't have a number on hand for the last question, but he decided to figure it out. After adding a few variables to the startup's Statsd/Graphite setup, Creeley came up with the graph pictured above. The purple region represents links Embedly has crawled that provide Open Graph metadata as a percentage of all links.
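For readers unfamiliar with Statsd, counting of this kind might look roughly like the sketch below. The metric names and helper functions are hypothetical, since the article does not say which variables Creeley added. The `name:value|c` line is StatsD's plain-text counter format, normally sent over UDP (port 8125 by default).

```python
import socket


def counter_line(metric, value=1):
    """Format a StatsD counter increment, e.g. 'crawl.urls.total:1|c'."""
    return f"{metric}:{value}|c"


def record_crawl(has_open_graph, send):
    """Count every crawled URL, and separately those with Open Graph tags.

    `send` is any callable that accepts one StatsD line (hypothetical hook).
    """
    send(counter_line("crawl.urls.total"))           # hypothetical metric name
    if has_open_graph:
        send(counter_line("crawl.urls.open_graph"))  # hypothetical metric name


def udp_sender(host="127.0.0.1", port=8125):
    """Return a `send` callable that ships lines to a StatsD daemon over UDP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def send(line):
        sock.sendto(line.encode("ascii"), (host, port))

    return send


def open_graph_share(total, with_open_graph):
    """Percentage of crawled URLs that carried Open Graph metadata."""
    return 100.0 * with_open_graph / total if total else 0.0
```

Graphite would then plot the two counters against each other; `open_graph_share` shows the same ratio computed by hand.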
It's worth noting that Embedly's crawler doesn't go out looking for URLs; it only processes URLs that have been shared through its API. As such, one could argue that the company's Open Graph figure runs higher than the Web-wide average, since the sites that get shared most are the ones most likely to be optimized for sharing on Facebook.
Still, the above graph was generated over the last 36 hours, which comes out to a sample size of 12 million URLs. Furthermore, Creeley points out that Google Trends shows Open Graph is trending up, so we'll likely see even more URLs with Open Graph tags in the near future.
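As a rough back-of-the-envelope check on those figures, the snippet below works out what they imply. It treats "42 percent", "36 hours" and "12 million" as exact, which they almost certainly are not, so the results are only approximate.

```python
SAMPLE_URLS = 12_000_000   # sample size quoted in the article
WINDOW_HOURS = 36          # collection window quoted in the article
OPEN_GRAPH_PERCENT = 42    # share of URLs with Open Graph tags

# Integer arithmetic avoids floating-point noise in the headline count.
urls_with_open_graph = SAMPLE_URLS * OPEN_GRAPH_PERCENT // 100
urls_per_hour = SAMPLE_URLS / WINDOW_HOURS
urls_per_second = urls_per_hour / 3600

print(f"URLs with Open Graph tags: {urls_with_open_graph:,}")
print(f"URLs processed per hour:   {urls_per_hour:,.0f}")
print(f"URLs processed per second: {urls_per_second:.1f}")
```

In other words, roughly 5 million of the sampled URLs carried Open Graph tags, and the service was handling somewhere around 90 URLs per second over that window.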