Data Liberation: Meet WordPress.org’s Ambitious Plan for 2024

“Imagine a more open web where people can switch between any platform of their choosing. A web where being locked into a system is a thing of the past. This is the web I’ve always wanted to see.”

At the end of his State of the Word 2023 keynote, Matt Mullenweg pulled a Steve Jobs move, revealing Data Liberation, an initiative he described as the thing he’s most excited about.

“In 2024, we want to unlock the web through a dedicated focus on migration tools. We want to make first-party community plugins, tools, and workflows available on WordPress.org […] I want it to be seamless, straightforward, and as zero friction as possible.”

Mullenweg promised that WordPress.org would take these projects under its wing, provide each with a dedicated Slack channel and GitHub repository on its official platforms, and ensure a swift review process of about one business day.

In a post published shortly after the event, Mullenweg laid out an even more ambitious plan:

“Migrating your site to WordPress, or exporting all your content from WordPress, should be possible in one click. I want WordPress’ export format to become the lingua franca of CMSes, whether coming to WordPress or moving within WordPress.

[…] it should be more than plugins; workflows, tutorials, and helper scripts should be shared, too. I want this resource to have space to include moving from social networks, moving from a page builder to core blocks, switching from classic to blocks, and improving WordPress current canonical plugins for importing.”

We’ve Only Just Begun

There’s a lot of work to do: the guides on the Data Liberation website are mostly outlines, waiting for contributors to turn the team’s good intentions into content. The recently updated Importing Content guide on the WordPress Developer Resources website still includes obsolete services and links.

Navigate to Tools > Import, and you’ll get a similar blast from the past looking at the importing tools on offer: Blogger, LiveJournal, Movable Type and TypePad. If you have a website hosted on WordPress.com, you’ll get support for a handful of contemporary platforms, including Medium, Squarespace, Substack, and Wix.

Contributors did a little spring cleaning in 2021, moving the code of these and other somewhat relevant importers from SVN to GitHub, where it’s easier to maintain. But that’s barely scratching the surface.

What’s needed now is traction.

Calling All Hosting Companies

In a post titled Data Liberation Next Steps, project leader Jordan Gillman introduced a few high-level paths forward:

There are still no concrete timelines, and not everybody in the community is thrilled, so we reached out to Gillman to get some more answers.

What’s the current status of the project?
“There’s been a conversation about how to approach data liberation across teams. I’m hoping to bring those conversations together to brainstorm and more concretely plan what the project actually looks like.

Broadly speaking, the response has been really positive. However, getting buy-in from plugin developers and other stakeholders will continue to be a priority for this project. The first initial challenge is one of coordination; this is going to be a collaborative project (by design), which means drawing folks and companies together to define the path forward.

I’ll be at WordCamp Asia next week, and look forward to chatting with folks in person!”

Are there plans to collaborate with other online platforms (open source or commercial) to promote this initiative and make it a reality? I imagine a utopian scenario where WordPress leadership sits down with major players in the industry to create a standard—something along the lines of the efforts made in the case of the EU’s Cyber Resilience Act.
“Data Liberation could be an opportunity to lead by example. And since WordPress is doing this publicly, anyone can share updates about this work. That really means that how we go about this project today needs to be with that broader picture in mind.”

Have you considered starting with a universal content migration solution instead of the more ambitious data objective?
“At this stage, the focus is on addressing the goal of migrating sites to WordPress or exporting all content from WordPress in one click, thus simplifying the process of transferring content between platforms.

In terms of ‘universal’ content migration, I think conversations about the WXR Import/Export format are really valuable. I’m also excited about the potential of WordPress Playground, which uses a ZIP file for migrations.

It’s important to note that this is an ongoing project and that these efforts don’t exclude bigger goals related to ‘data.’ They provide a tangible starting point, with small, achievable phases to actively involve contributors in progressing toward our broader goals.”

You ended the post with questions. How would you answer the last two?

  • With such a broad scope, what should be focused on with the highest priority?
    “One thing that I’m particularly excited about is pulling the disparate plugins, guides, and tools and unifying them in an interface to simplify the experience for users in the migration process. A user shouldn’t have to go searching for the right tool (or combination of tools); we should be able to direct them to the appropriate solution, and walk them through the process!”
  • What other voices should be heard on this topic?
    “I hope that this question prompts voices from individuals and groups that I haven’t already thought of. I’m also keen to hear from web hosts. Given their daily management of migrations for customers, I believe they have insights into pain points, popular platforms (and how they could help), and valuable tools and workarounds that could benefit the wider community.”

Radical Ethos

Manton Reece, the creator of Micro.blog, recently wrote about why and how he uses Blog Archive format (.bar), a format he developed in 2017, hoping it would become universal. Apparently, Dave Winer, co-author of the RSS format, had similar ideas.

During the keynote, Mullenweg made clear that, more than a technical vision, Data Liberation is an ethical commitment: “Data Liberation is not just about building the tools, it’s about cultivating a community ethos.”

Josepha Haden Chomphosy, the Executive Director of the WordPress project, expressed a similar sentiment in her post-event letter:

“There’s an extent to which the idea of owning your content and data online is a radical idea. Securing an open web for the future is, I believe, a net win for the world especially when contrasted to the walled gardens and proprietary systems that pit us all against one another with the purpose of gaining more data to sell.”

Their words echo the moral obligation to web users that core contributor Gary Pendergast described in 2021. In a four-part series, Pendergast draws a neat circle around open web advocates’ shared vision: he explores the history of a category of tools he calls WordPress Importers, outlines the problems, and eventually shares a solution, an experimental browser extension that exports data from Wix using its API.

“Some services […] don’t allow you to export your own content. Other services provide incomplete or fragmented exports, needlessly forcing stress upon site owners […]

When a CMS actively works against providing such freedom to their community, I would argue that we have an obligation to help that community out.

WordPress doesn’t exist in a vacuum, we’re part of a broad ecosystem which can only exist through the web remaining open and free. By encouraging all CMSes to provide proper exports, and implementing them for those that don’t, we help keep our ecosystem healthy.”

Between rescuing users from vendor lock-in and fostering a healthy ecosystem, WordPress.org is in a perfect position to take the lead on such an undertaking and support, fund, and promote it.

Software Can Do That

With the philosophical problems out of the way and the potential to solve the commercial ones on the horizon, we’re left with the technical issues.

The primary pain point of WXR (WordPress eXtended RSS, the XML format WordPress uses for exports) is the lack of media support, but once you start considering newer tech stacks, this suddenly seems like the least of everyone’s problems.

Let’s present these players:

  • W3Techs places WordPress (43.2%), Shopify (4.3%), and Wix (2.6%) as the top three most used CMSs.
  • BuiltWith’s dataset shows that Webflow is number 15 on the list of CMSs used across the top 1 million sites crawled, and Contentful is number 26.
  • WebTechSurvey’s analysis reports that 42,900 of the 31,758,591 websites that use a CMS opt for a headless CMS, with Ghost leading the market (41.05%).

For all the hype around them, the tools that many developers (not just younger ones) are touting are almost entirely missing from the project’s repository. So far, it hasn’t attracted interest from people working with headless CMSs like Ghost, Sanity, or Prismic.

Trendy site builders like Webflow or Storyblok, which are steadily gaining traction among product-oriented teams, are also absent. They are less restrictive than their predecessors, but they share the same mean inclination to make it harder to move away.

Migrating to and from these new platforms presents new challenges, including mapping custom schemas and supporting a range of file formats (RSS/XML, JSON, CSV, Markdown, HTML).
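To make the schema-mapping challenge concrete, here is a minimal sketch in Python of how a record from a headless CMS might be mapped onto a WXR-style item. The source field names (headline, author_name, body_html, slug) and the FIELD_MAP table are hypothetical, invented for illustration; the output is only a small subset of what a real WXR file contains and is not validated against the WordPress importer. Real exports also wrap content in CDATA, which this sketch skips.

```python
import xml.etree.ElementTree as ET

# Namespaces that WXR 1.2 exports declare (only the subset used here).
NS = {
    "content": "http://purl.org/rss/1.0/modules/content/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "wp": "http://wordpress.org/export/1.2/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)

# Hypothetical mapping: source field -> (namespace prefix or None, element).
# The source field names are assumptions, not any platform's real schema.
FIELD_MAP = {
    "headline": (None, "title"),
    "author_name": ("dc", "creator"),
    "body_html": ("content", "encoded"),
    "slug": ("wp", "post_name"),
}


def to_wxr_item(record: dict) -> ET.Element:
    """Build a WXR-style <item> from the mapped fields of one record."""
    item = ET.Element("item")
    for source_field, (prefix, name) in FIELD_MAP.items():
        if source_field not in record:
            continue
        tag = f"{{{NS[prefix]}}}{name}" if prefix else name
        # ElementTree escapes the text; it does not emit CDATA sections.
        ET.SubElement(item, tag).text = str(record[source_field])
    # Fixed values chosen for this sketch: import everything as draft posts.
    ET.SubElement(item, f"{{{NS['wp']}}}post_type").text = "post"
    ET.SubElement(item, f"{{{NS['wp']}}}status").text = "draft"
    return item


def unmapped_fields(record: dict) -> list:
    """List source fields the mapping silently drops (e.g. media)."""
    return sorted(set(record) - set(FIELD_MAP))
```

Calling unmapped_fields on a record that carries, say, a hero_image field shows what the migration would lose. That is the custom-schema problem in miniature: every platform needs its own mapping table, plus a decision about what to do with the fields that have no obvious home.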

None of that is impossible to resolve; that’s what software is for. But there is one more obstacle: the knowledge gap between those who possess a deep understanding of Gutenberg, its architecture, and design principles, and all the others, including some core contributors and plugin developers.

In an interesting discussion on the project’s Slack channel, Automattic’s Software Design Engineer, Dennis Snell, demonstrated how invaluable it is to keep this initiative as collaborative and wide-reaching as possible.

Thank You For Choosing WordPress

These are still early days, but the Data Liberation initiative is a rare political act, and how the platform’s leadership frames it is inspiring.

When billion-dollar corporations gobble up as much data as they can store on their servers, it’s almost defiant to invest time and money in making it easier for people to do with their data—their business, their thoughts—as they wish.

Information, generated by these bits of data and serving as the foundation of knowledge, doesn’t want to be free. It’s us, humans, who want that. Let’s do something about it.
