Google Summer of Code 2010 project ideas
See also: BIREME GSoC Application
For further information, please join our GSoC mailing list.
CDS/ISIS is a document-oriented database developed by ILO, Unesco, BIREME and others since the 1960s (see AboutIsis).
Currently it is widely used in developing countries, both for large-scale bibliographic databases such as SciELO and LILACS and for online catalogs in thousands of libraries, from large to very small. It runs on Linux, other Unix systems and Windows.
Generic ISIS ideas
The following ideas are applicable to ISIS databases in general, and can be developed in Python or Java, depending on mentor-student agreement.
- Generic Loader: implement a Web interface for schema definition, similar to the one in ABCD, and generate Web-based CRUD screens from the schema, using ISIS-NBP or Hathor as backend.
- DataGuides for ISIS: A DataGuide is a method for extracting schema information from records in a semistructured database. It is described in the paper "DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases" by Goldman and Widom.
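As a rough illustration of the DataGuides idea above, the sketch below summarizes a set of records as a trie in which every distinct label path appears exactly once. It assumes records are already available as nested Python dicts and lists (a hypothetical in-memory representation, not the ISIS-NBP or Hathor API), and it only covers tree-shaped records; the paper handles general graph-structured data.

```python
def build_dataguide(records):
    """Return a nested dict (trie) holding each distinct label path once."""
    guide = {}
    for record in records:
        _merge(guide, record)
    return guide


def _merge(node, value):
    """Fold the labels found in `value` into the trie node `node`."""
    if isinstance(value, dict):
        for label, child in value.items():
            _merge(node.setdefault(label, {}), child)
    elif isinstance(value, list):
        # Repeated occurrences of a field share the same path.
        for item in value:
            _merge(node, item)
    # Atomic values are leaves and add no further labels.


def label_paths(guide, prefix=()):
    """List every label path in the guide, in sorted depth-first order."""
    paths = []
    for label in sorted(guide):
        path = prefix + (label,)
        paths.append(".".join(path))
        paths.extend(label_paths(guide[label], path))
    return paths


# Hypothetical records: field tags as keys, subfields as nested dicts.
records = [
    {"24": "DataGuides", "10": [{"a": "Goldman"}, {"a": "Widom"}]},
    {"24": "Another title", "10": [{"a": "Author", "b": "Affiliation"}],
     "70": ["keyword one", "keyword two"]},
]

if __name__ == "__main__":
    print(label_paths(build_dataguide(records)))
```

The resulting list of paths could drive a query form or a schema report without requiring a schema to be declared in advance.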
The ISIS Network Based Platform is an architecture and a set of tools for managing ISIS databases in a networked environment. It is implemented in Python.
- Integrate Lucene for reference lookups
- Implement locking for concurrent access
- Adapt Django ORM to bind to ISIS-NBP native database (.MSF files) so that its self-generated admin GUI can operate seamlessly over ISIS-NBP.
- Isolate the ISIS-NBP datamodel (Python classes) so that it can be bound either to a set of MasterFiles (ISIS databases), to an OODB (like ZODB) or even to a relational DB. This project would be a cross-model ORM.
- Create infrastructure to validate and stress-test the variable-size MasterFiles in ISIS-NBP. Today, ISIS-NBP uses two size definitions, but the data layout of its binary files is parameterized, and this feature has never been properly tested.
- Add a recommendation system to ISIS-NBP facilities.
- Create ISIS-NBP Cells for versioning and caching using the gateway API (see the ISIS-NBP Vision Report).
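One possible starting point for the cross-model ORM idea in the list above is to keep the record class unaware of where it is stored and to put each backend behind a small interface. The sketch below is only an illustration: `RecordStore`, `InMemoryStore` and `Record` are hypothetical names, not part of ISIS-NBP, and the in-memory store stands in for a MasterFile, ZODB or relational implementation.

```python
from abc import ABC, abstractmethod


class RecordStore(ABC):
    """Hypothetical storage interface; one subclass per backend."""

    @abstractmethod
    def save(self, mfn, fields):
        """Persist the fields of the record with this master file number."""

    @abstractmethod
    def load(self, mfn):
        """Return the fields of the record with this master file number."""


class InMemoryStore(RecordStore):
    """Stand-in backend used here instead of a real MasterFile or database."""

    def __init__(self):
        self._rows = {}

    def save(self, mfn, fields):
        self._rows[mfn] = dict(fields)

    def load(self, mfn):
        return dict(self._rows[mfn])


class Record:
    """Datamodel class that knows nothing about the storage format."""

    def __init__(self, mfn, fields):
        self.mfn = mfn
        self.fields = dict(fields)

    def save(self, store):
        store.save(self.mfn, self.fields)

    @classmethod
    def load(cls, store, mfn):
        return cls(mfn, store.load(mfn))


if __name__ == "__main__":
    store = InMemoryStore()
    Record(1, {"title": "Example"}).save(store)
    print(Record.load(store, 1).fields)
```

With this split, the validation and stress-testing work could target each `RecordStore` implementation separately while reusing the same datamodel tests.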
Hathor ISIS library
Hathor is a set of tools and sample GUI apps for managing ISIS databases. It is implemented in Java.
- port the Formatting Language sub-system from ISIS-NBP
- integrate Lucene for reference lookups
- Django-Jython integration: create a demo app using Django on Jython as Web front-end, with a Hathor backend
The Clinical Trials Register Platform is an application under development for the permanent public registration of clinical trials, following WHO guidelines and intended for deployment in Brazil and other countries. It is 100% multilingual, both at the user interface and content levels, and is built with Django using a relational back end.
- port the same app to a CouchDB backend, to make it easier to adapt to different record structures.
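To give a feel for why a document store might suit varying record structures, the sketch below models one trial as a JSON-style document with per-language values and a fallback lookup. All field names and the identifier are hypothetical, chosen only for illustration; they are not taken from the platform or from the WHO data set, and no CouchDB client is used.

```python
import json

# Hypothetical document: translatable fields map language codes to text.
trial_doc = {
    "_id": "trial-0001",
    "type": "clinical_trial",
    "public_title": {"en": "Example trial", "pt": "Ensaio de exemplo"},
    "recruitment_countries": ["BR"],
}


def localized(doc, field, lang, fallback="en"):
    """Return the text of `field` in `lang`, else in `fallback`, else None."""
    values = doc.get(field, {})
    return values.get(lang, values.get(fallback))


if __name__ == "__main__":
    # The same dict serializes directly to the JSON a document store holds.
    print(json.dumps(trial_doc, ensure_ascii=False, indent=2))
    print(localized(trial_doc, "public_title", "pt"))
```

A deployment that needs an extra field could add it to its own documents without a schema migration, which is the flexibility the port is meant to explore.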
About SciELO Methodology
Projects for SciELO Methodology
- Author disambiguation for SciELO articles and article references. This work mainly involves developing database processing to disambiguate the author names and affiliations stored in the SciELO database, with a view to integrating these data with other author databases.
- Classify the content of SciELO pages with RDF Schema, with a view to developments in a semantic web context. Develop at least 3 services using the RDF content.
- Technological update of the SciELO platform. The SciELO website was developed twelve years ago with PHP4 and ISIS databases. The main changes intended to update the technology behind the site are:
- Test the feasibility of migrating the databases from ISIS to CouchDB for use in the public website.
- Migrate the base programming language to PHP5 or Python. The programming language still needs to be defined.
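For the author disambiguation idea listed above, a common first step is to group name variants under a coarse key before applying finer comparisons such as affiliation or co-author matching. The sketch below is one such grouping under simplifying assumptions: the key is the last name plus the first initial, with accents removed, and compound surnames such as "da Silva" are not handled. The function names are hypothetical.

```python
import unicodedata
from collections import defaultdict


def name_key(name):
    """Return 'surname initial', lowercased and stripped of accents."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch)).lower()
    if "," in text:
        # "Surname, Given names" form.
        surname, given = [part.strip() for part in text.split(",", 1)]
    else:
        # "Given names Surname" form; the last token is taken as surname.
        parts = text.split()
        if not parts:
            return ""
        surname, given = parts[-1], " ".join(parts[:-1])
    return f"{surname} {given[:1]}".strip()


def group_candidates(names):
    """Group name strings that share a key and may be the same author."""
    groups = defaultdict(list)
    for name in names:
        groups[name_key(name)].append(name)
    return dict(groups)


if __name__ == "__main__":
    sample = ["Gonçalves, José", "J. Goncalves", "Maria Souza"]
    print(group_candidates(sample))
```

Names that land in the same group are only candidates for merging; deciding whether they are the same person would still need the affiliation data mentioned in the project description.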