OSCAR3 is a tool for shallow, chemistry-specific parsing of chemical documents. It identifies (or attempts to identify):
In addition, where possible the chemical names that are detected are annotated with structures, either via lookup or name-to-structure parsing ("OPSIN"), and with identifiers from the chemical ontology ChEBI.
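The lookup-then-parse fallback described above can be illustrated with a minimal sketch. This is not OSCAR3 or OPSIN code: the table `KNOWN_STRUCTURES`, the toy parser `toy_name_to_structure`, and the function `resolve_structure` are all hypothetical names invented for illustration, and the toy parser handles only a handful of simple alkanes.

```python
from typing import Callable, Optional, Tuple

# Hypothetical lookup table mapping names to SMILES strings (not OSCAR3 data).
KNOWN_STRUCTURES = {
    "water": "O",
    "ethanol": "CCO",
}


def toy_name_to_structure(name: str) -> Optional[str]:
    """Stand-in for a name-to-structure parser; knows four n-alkanes only."""
    alkanes = {"methane": 1, "ethane": 2, "propane": 3, "butane": 4}
    carbons = alkanes.get(name)
    return "C" * carbons if carbons else None


def resolve_structure(
    name: str,
    parser: Callable[[str], Optional[str]] = toy_name_to_structure,
) -> Tuple[Optional[str], str]:
    """Try the lookup table first, then fall back to the parser.

    Returns (structure, source), where source records how the
    structure was obtained: "lookup", "parsed", or "unresolved".
    """
    key = name.strip().lower()
    if key in KNOWN_STRUCTURES:
        return KNOWN_STRUCTURES[key], "lookup"
    parsed = parser(key)
    if parsed is not None:
        return parsed, "parsed"
    return None, "unresolved"
```

Recording the source alongside the structure lets a caller distinguish curated lookups from parsed results, and names that neither route can resolve are simply left without a structure.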
OSCAR3 also includes the OSCAR Server, a Jetty-powered set of servlets. These provide the following services:
OSCAR3 files may be downloaded from the SourceForge download page. Besides OSCAR3 itself, two OSCAR3 components have been split out and made available as separate modules: OPSIN, a chemical name-to-structure converter, and ChemTok, a tokeniser optimised for text containing chemical names.
OSCAR3 is in the process of being superseded by OSCAR4, which is currently under development.
If you use OSCAR3 to produce results for publication, then please cite us:
High-Throughput Identification of Chemistry in Life Science Texts. Peter Corbett and Peter Murray-Rust. CompLife 2006, LNBI 4216, pp. 107–118, 2006.
Semantic enrichment of journal articles using chemical named entity recognition. Colin R. Batchelor and Peter T. Corbett. Proceedings of the ACL 2007 Demo and Poster Sessions, pp. 45–48, Prague, June 2007.
If you use SAPIENT to produce results for publication, then please cite:
Semantic Annotation of Papers: Interface and Enrichment Tool (SAPIENT). Liakata M., Claire Q and Soldatova L. N. (2009). Proceedings of BioNLP 2009, pp. 193–200, Boulder, Colorado.
This release includes the following improvements:
Extensive improvements to OPSIN
Subtypes: whether a chemical name refers to a specific whole compound, a class of compounds, or part of a compound
A new, improved method of using chemical names from PubChem for name resolution
Lots of general code clean up, bug fixes, etc.
This release fixes a small but critical bug in alpha3 that caused workspaces to be set up incorrectly. A bug in the resources browser was also fixed.
Much like alpha2, this release adds extra documentation, algorithm improvements and some UI tweaks. In particular, OSCAR3 can now give confidence estimates. The search UI has also been improved.
This release adds extra documentation, algorithm improvements and some UI tweaks.
This release has been made possible by a lot of different people. The OSCAR project has been going for several years at various rates, under the supervision of Prof. Peter Murray-Rust, and has had input from a large number of different contributors (these are listed in the THANKS.txt files in the distributions). However, I'd like to single out one organisation: the Royal Society of Chemistry. The OSCAR project started out with summer students funded by the RSC, and as well as providing considerable funding and test corpora to the project, they have also been able to dedicate large amounts of staff time and expertise to it. Getting this far (and there's still a way to go before the full release) would not have been possible without them.
Recent performance improvements have been made possible with the use of the excellent YourKit profiling tool. YourKit have kindly agreed to supply the OSCAR3 developers with free licenses to continue this profiling work.
YourKit is kindly supporting open source projects with its full-featured Java Profiler. YourKit, LLC is the creator of innovative and intelligent tools for profiling Java and .NET applications. Take a look at YourKit's leading software products: YourKit Java Profiler and YourKit .NET Profiler.