How to Curate Facts with BioKC and Explore BioKB

Find out more about how to use BioKC and BioKB. If you have a question that isn't answered here, don't hesitate to get in touch.

BioKB is a platform hosting a text mining pipeline that builds a large knowledge base of biological facts extracted from scientific publications, as well as this frontend to query the knowledge base. The text mining component analyzes article content and extracts relations between a wide variety of concepts, extending the scope from proteins, chemicals and pathologies to biological processes and molecular functions. Extracted knowledge is stored in a knowledge base that is publicly available for both human and machine access, via this web application and a SPARQL endpoint.

BioKC is the curation tool for BioKB, which benefits from the large knowledge base of text mined publications already available on this site. BioKC enables the user to construct building blocks of systems biology models and allows the user to annotate them with human-provided and machine-identified literature evidence. Knowledge representation in BioKC follows the SBML standard in formalising elements of a given model, their relationships and annotations. To access BioKC's functionality, simply register for and log into a BioKB user account.

BioKC

Curating Facts


At the top of the home page you will find a link to log in or register.

After registration, a group will be created for you in which you can start creating and populating your own facts.

Once logged in, you can save searches and create facts in BioKC.

A fact is similar to an SBML model. It contains Species, Compartments and Reactions. Each fact belongs to a single group of facts, in which users can have different roles.

We call the process of building a fact curation. Only users with the curator role can curate the model.

We can collect sentences found in BioKB and assign them to different parts of the fact model. We call this process Annotation. It is also possible to add custom sentences. Only the users with the role annotator can annotate the model.

Example of a fact as seen in the Fact view page.

How do I work on a fact?

Also, what do Annotator's Mode and Curator's Mode mean?

To modify the different parts of a fact, enable one of two modes by clicking the corresponding icon button. There you will find Annotator's Mode and Curator's Mode. Access to these modes is restricted by the user's group roles.

Users with annotation permission can start the annotation mode in order to assign evidence from BioKB to different parts of the fact model.

Users with curation permission can start the curation mode to modify the different elements that compose the fact model. Curators can add and modify Species, Compartments, Reactions as well as remove them from the fact model.

⚠️ Important! A warning will be shown if another user is already curating or annotating a fact. We recommend that only one user curates or annotates a fact at a time. Nothing stops several users from doing so simultaneously, but keep in mind that concurrent curation may produce undesired and inconsistent results.

Curators and Annotators can play a fact in different modes depending on their roles.

While in Annotation Mode, sentences found in BioKB can be assigned to different parts of the model.
A box in the top right corner indicates that this mode is enabled and shows information about the annotations. This menu remains available as you browse BioKB.

Similarly, the Curator's Box will be shown while the fact is being curated. Both menus allow the user to stop the annotation/curation process.

When you log in, the default working mode is the basket mode. The basket shows up only when you are not annotating or curating a fact (see playing a fact).

On co-occurrence or relationship pages, the icon to the right of each sentence lets you add that sentence to the basket.

You can find this button while browsing BioKB co-occurrences and relationship pages.

Added sentences will appear in the top right menu with a link to their original page. Entities from sentences will be listed too. The main entities of the pages you visited will be highlighted in red.

Once you have finished gathering evidence, click on Go To Basket to return here and organise the evidence into facts.

The top right will show a menu with the collected sentences.

Once you are back here you can start creating Species, Compartments and Reactions for your facts.

Try dragging entities from the left side to the species for faster creation. Watch out for the consistency warnings! You will see the following indicator when an element requires your attention: ⚠️🤖

Drag entities to the Species column for faster creation.

You can enrich your elements with identifier annotations.

Tip: use the copy icon for faster element creation.

And if you went too fast, use the delete icon to remove elements.

Add Species, Compartments and Reactions. Remember that every Species needs a Compartment, but multiple Species can share the same Compartment.
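
The rule that every Species needs a Compartment can be illustrated with a small consistency check. This is only a sketch of the idea, not BioKC's actual data model or warning logic: the class names, fields and the function species_without_compartment are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Compartment:
    id: str


@dataclass
class Species:
    id: str
    compartment: str  # id of the Compartment this Species is located in


def species_without_compartment(species, compartments):
    """Return the ids of Species whose compartment id matches no known Compartment."""
    known = {c.id for c in compartments}
    return [s.id for s in species if s.compartment not in known]


compartments = [Compartment("cytosol")]
species = [
    Species("E", "cytosol"),
    Species("S", "cytosol"),  # several Species may share one Compartment
    Species("P", "nucleus"),  # "nucleus" was never created, so P is reported
]

print(species_without_compartment(species, compartments))
```

Here E and S share the cytosol compartment, while P points to a compartment that does not exist and is therefore the only element reported.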

When you have finished creating the elements of your model, you are ready to create a new Fact and assign reactions to it.

Drag reactions to the fact in order to assign them. Again, watch out for the consistency warnings! ⚠️🤖

Do not forget to assign a group, a name and a description to your fact before saving it!

Drag reactions to your fact.

You can assign the sentences you found to the fact you are creating by dragging them to the fact.

Once you are finished, click the button with the save icon to persist your fact.

Assign evidence from the basket to your fact.

On the sidebar you will find the button Import SBML that will redirect you to the import SBML page.

There you can either open an SBML file or paste its content into the text area. Additionally, you will have to provide a name and a prefix for the fact.

At this point you have two options:

  • You can import the whole SBML model as a fact by clicking Import model as a fact, which will redirect you to the curation page to work further on the fact.
  • Alternatively, you can edit the model components in the basket by clicking Edit model in the basket. Once the components look as desired, a fact can be created and further annotated in the curation page.

The curation mode allows the user to assign identifiers to the elements of a fact by clicking the plus button. You can choose a namespace from those available at identifiers.org.
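
As a rough illustration of what such an identifier looks like, the sketch below builds an identifiers.org URI from a namespace prefix and an accession, following the compact identifier form https://identifiers.org/<prefix>:<accession>. The helper name is hypothetical, and how BioKC stores identifiers internally is not described here, so treat this as an assumption.

```python
def identifiers_org_uri(prefix: str, accession: str) -> str:
    """Build a URI of the form https://identifiers.org/<prefix>:<accession>."""
    prefix = prefix.strip()
    accession = accession.strip()
    if not prefix or not accession:
        raise ValueError("both a namespace prefix and an accession are required")
    return f"https://identifiers.org/{prefix}:{accession}"


# NCBI Taxonomy entry 10090 (Mus musculus) in the "taxonomy" namespace.
print(identifiers_org_uri("taxonomy", "10090"))
```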

Note that this task is included in the curation role because it entails editing the elements of a fact in the curation page. Conversely, the annotation mode allows for the annotation of the fact from elements found in BioKB.

The curation interface allows the user to create versions of a fact by clicking the corresponding icon tab. These versions can be either made public or kept private. Release notes can be added to indicate the changes between versions.

Only users with a curator role can perform this action.

Note that if you decide to make a fact public, a stable identifier will be linked to the current version of your fact. This operation is irreversible, and the fact cannot be removed afterwards. The contact details of the group members will appear publicly as maintainers of this fact, so that other researchers can contact them (see Fact Search).

Roles provide different sets of actions. Roles are assigned per group of facts, so each user can have a different set of roles for each group of facts.

Reader

  • Read only access to facts.

Annotator

  • Permission to annotate the facts with sentences from BioKB or the Custom Annotation Tool.

Curator

  • Permission to change the elements that compose a fact.
  • Permission to add identifiers to the elements that compose a fact.
  • Permission to issue a version of a fact.

Manager

  • Administrators of the group.
  • Can delete facts.
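
The role list above can be read as a per-group permission table. The sketch below is a hypothetical illustration of that reading, not BioKC's implementation: the permission names are invented, and since the list does not say whether one role includes the permissions of another, no inheritance is assumed.

```python
# Permissions per role, transcribed from the role list. Role inheritance is
# not stated in the help text, so each role only gets what is listed for it.
ROLE_PERMISSIONS = {
    "reader": {"read"},
    "annotator": {"annotate"},
    "curator": {"edit_elements", "add_identifiers", "issue_version"},
    "manager": {"administer_group", "delete_fact"},
}


def can(user_roles, group, action):
    """Check an action for a user whose roles are given as {group: {role, ...}}."""
    roles = user_roles.get(group, set())
    return any(action in ROLE_PERMISSIONS[role] for role in roles)


# Hypothetical user with different role sets in two groups of facts.
alice = {"group-a": {"reader", "curator"}, "group-b": {"reader"}}

print(can(alice, "group-a", "issue_version"))
print(can(alice, "group-b", "issue_version"))
```

Because roles are looked up per group, the same user can issue versions in one group while having read-only access in another.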

On the entity pages you will find a save button.

This button allows you to save the filter parameters of the Incoming/Outgoing relationships and Co-occurrences tables to facilitate going back to the same results.

You can browse and manage your saved searches in the sidebar and in your user details.

BioKC allows you to add a custom sentence that you cannot find in BioKB. To do so, click on the icon and enter the text of your sentence. You can simulate the tagging of entities by clicking on tag sentence.

Afterwards, you can fill in details such as journal title and publication title by entering the DOI, PMC number or another identifier of the publication. Finally, click on assign annotation to model to add it.

The home page of BioKB allows searching public facts based on identifiers.

To do so, go to the Fact Search tab on the home page and enter an identifier. Facts that include the selected identifiers will appear in the search results. The facts in the results can be exported to SBML format.

BioKB

Browsing Text Mining Results


Entity Search

First, start with a simple search on the home page. You will be redirected to the entity view.

For more advanced searches, try the advanced search form. You will also be redirected to the entity view, but this time the results will be filtered with your search choices.

Entity View

The search form will redirect you to the Entity View where you can start drilling down into the results of your search.

The Entity View consists of three sections: Co-occurrences, Relationships and Visualization.

Co-occurrences

This section shows entities appearing together with the searched entity.

Relationships

This section shows relationships from extracted sentence events in which the searched entity appears. Contrary to co-occurrences, relationships have directionality, either incoming or outgoing, and a relationship type.

Visualization

The graph shows the combined network of incoming and outgoing relationships. Please note that at the moment, it only considers the relationships appearing in the first page of each table.

Entity

  • This is the primary page type, displaying all the information in the knowledge base about this particular entity
  • Users can download the full search results for this entity, either as RDF triples or as a CSV of "subject-predicate-object" triples
  • Logged in users can save searches to their dashboard from here.
  • Example: /entity/DOID_2841
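
A downloaded CSV of "subject-predicate-object" triples lends itself to simple scripting. The sketch below separates incoming from outgoing relationships for one entity. It assumes a header row with the column names subject, predicate and object, which this page does not confirm, and the second sample row uses a placeholder identifier that is purely illustrative.

```python
import csv
import io

# Inline sample standing in for a downloaded file. The column names are an
# assumption; EXAMPLE_1 is a placeholder, not a real knowledge base entry.
SAMPLE = """subject,predicate,object
GO_0046960,increases,DOID_2841
DOID_2841,regulates,EXAMPLE_1
"""


def split_by_direction(csv_text, entity):
    """Return (incoming, outgoing) triples relative to the given entity."""
    incoming, outgoing = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        triple = (row["subject"], row["predicate"], row["object"])
        if row["object"] == entity:
            incoming.append(triple)
        if row["subject"] == entity:
            outgoing.append(triple)
    return incoming, outgoing


incoming, outgoing = split_by_direction(SAMPLE, "DOID_2841")
print("incoming:", incoming)
print("outgoing:", outgoing)
```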

Relationship

  • Relationship pages provide all the evidence, grouped by publication, for a specific relation between two entities.
  • Logged in users can add statements to their basket from here.
  • Example: /relationship/GO_0046960-increases-DOID_2841

Cooccurrence

  • Cooccurrence pages provide all the instances of two given entities occurring together, grouped by publication.
  • Logged in users can add statements to their basket from here.
  • Example: /cooccurrence/DOID_2841-10090

Topic

  • Topic pages are pre-built pages focusing on an area of special interest. The only topic page at present is for Coronavirus infection.
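
The example paths given for the entity, relationship and cooccurrence pages suggest a simple naming pattern. The helpers below reproduce that pattern. They are inferred from those three examples only, so the function names are hypothetical and other identifier shapes may not follow the same scheme.

```python
def entity_path(entity_id: str) -> str:
    """Path of an entity page, e.g. /entity/DOID_2841."""
    return f"/entity/{entity_id}"


def relationship_path(subject_id: str, relation: str, object_id: str) -> str:
    """Path of a relationship page: subject, relation type and object joined by hyphens."""
    return f"/relationship/{subject_id}-{relation}-{object_id}"


def cooccurrence_path(first_id: str, second_id: str) -> str:
    """Path of a cooccurrence page for two entities."""
    return f"/cooccurrence/{first_id}-{second_id}"


print(entity_path("DOID_2841"))
print(relationship_path("GO_0046960", "increases", "DOID_2841"))
print(cooccurrence_path("DOID_2841", "10090"))
```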

BioKB is not just this website; it is also a graph database that stores the results of the text mining pipeline. You can browse this database using SPARQL queries via the SPARQL Endpoint. If you are familiar with RDF, below you can find a schema of the ontology.

Entity: an entity is any kind of taggable concept, including diseases, species, proteins, phenotypes, chemicals, tissues, biological processes, cellular compartments and molecular functions.

Cooccurrence: a cooccurrence refers to any instance of two entities that are present near each other in a publication, usually within the same sentence. Cooccurrences are non-directional and do not automatically imply a semantic relationship between two concepts.

Relationship: a relationship between two entities is a specific type of directional cooccurrence whereby two entities occur in close proximity within the same sentence and the BioKB text mining pipeline was able to identify a directional linking concept such as "increases", "bindsWith" or "regulates" between them. The extraction of relationships requires the elaboration of complex text mining rules, in particular for relationships between concepts such as proteins and diseases, which are often described in publications in complex, multi-clause sentence structures. The absence of tagged relationships for cooccurring concepts should therefore not be taken as a definite lack of relationship.

Class hierarchy in BioKB ontology.
Object properties in BioKB ontology.
Data properties in BioKB ontology.
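
For machine access, a query can be sent to the SPARQL endpoint over HTTP. The sketch below only builds such a request, following the SPARQL 1.1 Protocol convention of passing the query text in a query parameter. The endpoint URL is a placeholder and must be replaced with the real address, and the query is deliberately generic because the ontology's class and property names are not listed on this page.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder address: replace with the actual BioKB SPARQL endpoint URL.
ENDPOINT = "https://example.org/sparql"

# Generic query that lists a few arbitrary triples; no ontology terms assumed.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build a GET request carrying the query, asking for JSON results."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
print(request.full_url)
```

Passing the resulting request to urllib.request.urlopen would send it. That step is left out here because the placeholder address does not serve a SPARQL endpoint.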

Other useful help


The Systems Biology Markup Language (SBML) is a file format for representing computational models in a declarative form that can be exchanged between different software systems. SBML is oriented towards describing biological processes of the sort common in research on a number of topics, including metabolic pathways, cell signaling pathways, and many others. By supporting SBML as an input/output format, different tools can all operate on an identical representation of a model.

BioKC stores facts in a data model similar to SBML and allows exporting a fact in SBML format. See SBML FAQ for more questions.


Excerpt from: Systems Biology Markup Language (SBML) Level 2 Version 5: Structures and Facilities for Model Definitions

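$$E + S \underset{k_{\mathrm{off}}}{\overset{k_{\mathrm{on}}}{\rightleftharpoons}} ES \overset{k_{\mathrm{cat}}}{\longrightarrow} E + P$$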

In this example, the model has the identifier EnzymaticReaction. The model contains one compartment (with identifier cytosol), four species (with identifiers ES, P, S, and E), and two reactions (veq and vcat).

The elements in the listOfReactants and listOfProducts in each reaction refer to the names of elements listed in the listOfSpecies.

The correspondences between the various elements are explicitly stated by the speciesReference elements.

The model also features local parameter definitions in each reaction. In this case, the three parameters (kon, koff, kcat) all have unique identifiers and they could also have just as easily been declared global parameters in the model.

Local parameters frequently become more useful in larger models, where it may become tedious to assign unique identifiers for all the different parameters.
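
To make the structure of this example more concrete, the sketch below embeds a trimmed-down version of such a model and checks that every speciesReference points to a declared species. It is a reduced reconstruction, not the listing from the specification: quantities, kinetic laws and parameters are omitted, the reactant and product assignments are read off the enzymatic scheme, and the namespace URI is assumed to be the one used for Level 2 Version 5 documents.

```python
import xml.etree.ElementTree as ET

NS = {"sbml": "http://www.sbml.org/sbml/level2/version5"}

# Trimmed model: identifiers follow the excerpt, everything else is omitted.
SBML_TEXT = """<sbml xmlns="http://www.sbml.org/sbml/level2/version5" level="2" version="5">
  <model id="EnzymaticReaction">
    <listOfCompartments>
      <compartment id="cytosol"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="ES" compartment="cytosol"/>
      <species id="P" compartment="cytosol"/>
      <species id="S" compartment="cytosol"/>
      <species id="E" compartment="cytosol"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="veq">
        <listOfReactants>
          <speciesReference species="E"/>
          <speciesReference species="S"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="ES"/>
        </listOfProducts>
      </reaction>
      <reaction id="vcat" reversible="false">
        <listOfReactants>
          <speciesReference species="ES"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="E"/>
          <speciesReference species="P"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>"""


def undeclared_species(root):
    """Return species ids used in speciesReference elements but never declared."""
    declared = {s.get("id") for s in root.findall(".//sbml:listOfSpecies/sbml:species", NS)}
    referenced = {r.get("species") for r in root.findall(".//sbml:speciesReference", NS)}
    return sorted(referenced - declared)


root = ET.fromstring(SBML_TEXT)
model = root.find("sbml:model", NS)
reaction_ids = [r.get("id") for r in model.findall("sbml:listOfReactions/sbml:reaction", NS)]

print(model.get("id"), reaction_ids, undeclared_species(root))
```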


From the curator's perspective, BioKC only handles the following elements.

For now Species, Compartment, Reaction, SpeciesReference, ModifierSpeciesReference and some parts of Model are supported.

The whole SBML specification can be found on the SBML website: http://sbml.org/.


Species

A pool of entities of the same species type located in a specific compartment.


Compartment

A well-stirred container of a particular type and finite size where SBML species may be located. Every species in a model must be located in a compartment.


Reaction

A statement describing some transformation, transport or binding process that can change the amount of one or more species.

For example, a reaction may describe how certain entities (reactants) are transformed into certain other entities (products). Reactions have associated kinetic rate expressions describing how quickly they take place. In SBML, the rate expressions can be arbitrary mathematical functions.

A reaction represents any transformation, transport or binding process, typically a chemical reaction, that can change the quantity of one or more species. In SBML, a reaction is defined primarily in terms of the participating reactants and products (and their corresponding stoichiometries), along with optional modifier species, an optional rate at which the reaction takes place, and optional parameters.

These various parts of a reaction are recorded in the SBML Reaction object class and other supporting data classes defined in the SBML specification.

The lists of reactants, products and modifiers: the species participating as reactants, products, and/or modifiers in a reaction are declared using lists of SpeciesReference and/or ModifierSpeciesReference instances stored in listOfReactants, listOfProducts and listOfModifiers.


Model

Only one instance of a Model object is allowed per instance of an SBML document.

Model serves as a container for components of classes FunctionDefinition, UnitDefinition, CompartmentType, SpeciesType, Compartment, Species, Parameter, InitialAssignment, Rule, Constraint, Reaction and Event.

Instances of the classes are placed inside instances of classes ListOfFunctionDefinitions, ListOfUnitDefinitions, ListOfCompartmentTypes, ListOfSpeciesTypes, ListOfCompartments, ListOfSpecies, ListOfParameters, ListOfInitialAssignments, ListOfRules, ListOfConstraints, ListOfReactions, and ListOfEvents.

The “list” classes are defined in the SBML specification. All of the lists are optional, but if a given list container is present within the model, the list must not be empty; that is, it must have length one or more.
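
The rule that a list container must not be empty when present can be checked mechanically. The sketch below is a minimal illustration using Python's standard XML parser, not a full SBML validator: the function names and the sample document are hypothetical, and only list containers placed directly inside the model element are inspected.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip the '{namespace}' part that ElementTree prepends to tag names."""
    return tag.rsplit("}", 1)[-1]


def empty_model_lists(sbml_text: str):
    """Return names of listOf... containers under <model> that have no children."""
    root = ET.fromstring(sbml_text)
    empty = []
    for element in root.iter():
        if local_name(element.tag) != "model":
            continue
        for child in element:
            name = local_name(child.tag)
            if name.startswith("listOf") and len(child) == 0:
                empty.append(name)
    return empty


# Hypothetical document with one populated list and one empty list.
SAMPLE_SBML = """<sbml xmlns="http://www.sbml.org/sbml/level2/version5" level="2" version="5">
  <model id="demo">
    <listOfCompartments><compartment id="cytosol"/></listOfCompartments>
    <listOfSpecies/>
  </model>
</sbml>"""

print(empty_model_lists(SAMPLE_SBML))
```

In the sample, listOfCompartments holds one compartment and passes, while the empty listOfSpecies is reported.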