Plone powers 50GB of environmental data, maps and figures

by Sasha Vinčić Nov 30, 2009 08:59 AM
Plone is the open source CMS used most among government and EU organizations. One of them is the EEA (European Environment Agency), which recently migrated its data service from a custom solution on top of IIS to Plone. Plone now serves 50GB of environmental data, maps and figures.

Recently we upgraded our customer's website to handle large data. The main development of the product eea.dataservice was done by our colleagues at Eau de Web in Romania, and our part was to prepare Plone 2.5 to handle large data.

Back to the future - blobs in Plone

The first candidate for external storage was iw.fss, but due to the cluster setup on our production servers we went with the blob approach. Blobs are more future-proof, since they are part of the upcoming Plone 4, and unlike iw.fss they do not require shared read/write storage for all instances.

We tried out the existing branch of blob support for Plone 2.5, but it was old and lacked functionality, so we had to cut a new branch to backport the latest plone.app.blob. While Andreas Z was working on it, we were backporting the code more or less day by day :) Today the blob support for Plone 2.5 allows files and images to be stored outside the ZODB in blobs.

In eea.dataservice we have custom content types with large files and images, some of which are 18000x16000px! All these maps and figures are converted into a number of different formats. All the images are scaled to different sizes, and everything is stored in blobs. To store the scales in blobs we backported plone.app.imaging, which is used in Plone 4.
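
To give a feel for what scaling such a large image involves, here is a minimal sketch of the size arithmetic. It is not the plone.app.imaging code; the function names and the scale names and box sizes in `SCALES` are assumptions made up for illustration.

```python
def fit_within(width, height, max_width, max_height):
    """Return (w, h) scaled down to fit the box, keeping the aspect ratio."""
    if width <= max_width and height <= max_height:
        return width, height  # never upscale a small image
    # Compare aspect ratios with integer arithmetic to find the limiting side.
    if width * max_height >= height * max_width:
        # Width is the limiting side.
        new_w = max_width
        new_h = max(1, round(height * max_width / width))
    else:
        # Height is the limiting side.
        new_h = max_height
        new_w = max(1, round(width * max_height / height))
    return new_w, new_h


# Hypothetical scale names and box sizes, for illustration only.
SCALES = {"large": (768, 768), "preview": (400, 400), "thumb": (128, 128)}


def scale_sizes(width, height, scales=SCALES):
    """Compute the pixel size of every configured scale for one source image."""
    return {name: fit_within(width, height, *box) for name, box in scales.items()}
```

With these assumed boxes, an 18000x16000px source would give a "large" scale of 768x683px. Each such scale would then be written to its own blob instead of into the Data.fs.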

Fast but small servers - can't cache everything

Our cluster runs on blade servers with very little local storage, so we have to consider what to cache and where. The large storage on these servers is mounted from a SAN, which provides fast and secure storage. The blob storage is mounted on the machine that runs ZEO, and in addition we have smaller blob caches for the instances on each machine, 25GB per machine. Since we can't fit the whole blob storage in the cache, we clean it manually with a cron job. Newer versions of ZODB3 and Plone have a configuration option for automatic cleanup, but we can't use it with Plone 2.5. This manual cleaning outside Plone has raised an issue where an instance expects a blob to be in the cache but it is not there. This happens if the object referring to the blob is in the object cache of the instance. If it is not, the problem never arises, since the blob is reloaded from ZEO. We are now testing different configurations for the object cache size before we try to catch those exceptions and reload the blob from ZEO instead.
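
A cron-driven cleanup of this kind can be sketched roughly as follows. This is not our production script: the function name `prune_blob_cache`, the choice to evict by last access time, and the size limit are all assumptions for illustration.

```python
import os


def prune_blob_cache(cache_dir, max_bytes):
    """Delete least recently accessed files until the cache fits in max_bytes.

    Returns the list of removed paths.
    """
    entries = []
    total = 0
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished while we were scanning
            entries.append((st.st_atime, st.st_size, path))
            total += st.st_size

    removed = []
    # Oldest access time first, so recently served blobs survive.
    for _atime, size, path in sorted(entries):
        if total <= max_bytes:
            break
        try:
            os.remove(path)
        except OSError:
            continue  # e.g. removed by another process in the meantime
        total -= size
        removed.append(path)
    return removed
```

A cron job could call this with the blob cache directory of each instance and a limit somewhat below the 25GB available. Note that a script like this knows nothing about the object cache of the running instance, which is exactly how the missing-blob situation described above can occur.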

 

Open source

All code used for this website is open source and available in the Plone or collective repositories, or in the Eionet SVN (e.g. eea.dataservice).

Plone grew with 50GB of data and 20% more traffic

www.eea.europa.eu, the Plone 2.5 site, now gets 20% more traffic and holds 50GB more data than before the migration. The next step will be to migrate the multimedia and other large content to blobs, which will probably free almost 10GB from the Data.fs.

Want to know more?

Please contact Sasha Vinčić for more information. Contact details on the right.
