Content-based image retrieval (CBIR), also known as query by image content (QBIC) and content-based visual information retrieval (CBVIR), is the application of computer vision techniques to the image retrieval problem, that is, the problem of searching for digital images in large databases. Content-based image retrieval is opposed to traditional concept-based approaches (see Concept based image indexing). "Content-based" means that the search analyzes the contents of the image rather than metadata such as keywords, tags, or descriptions associated with the image. The term "content" in this context might refer to colors, shapes, textures, or any other information that can be derived from the image itself.
CBIRest API is:
If you are looking for a content-based image retrieval Java library, see the CBIRetrieval Java lib. It provides the same functionality as the CBIRest API, but you can run it as a simple app/server (command line) or embed it as a JAR in your own JVM app/server (Java import).
CBIRest is not only a REST API, it's a webapp too. The application is packaged with a web client. This is useful for management tasks (adding and deleting images, ...) and for running some tests.
The web application provides some functionalities:
Redis is mandatory if you want persistence. If you use Memory mode, you will lose your data when you restart the app.
The retrieval-*.war file from the zip is a WAR with an embedded Tomcat server (for quick install). The other WAR (.war.original) can be installed inside a web server/servlet container (only tested with Tomcat).
unzip CBIRest-$VERSION.zip
cd CBIRest-$VERSION
java -jar retrieval-$VERSION-SNAPSHOT.war --spring.profiles.active=prod --retrieval.store.name=MEMORY
./redis-server
When you are ready, change the default user and admin passwords (user/user and admin/admin by default)!
java -jar retrieval-$VERSION-SNAPSHOT.war --spring.profiles.active=prod --retrieval.store.name=MEMORY --retrieval.dataset.load=true --retrieval.dataset.path=$PATH
mvn -Pprod package -Dmaven.test.skip=true
Configuration can be done on multiple levels.
These are the main configuration flags for the CBIRest application. Just append them after the "java -jar..." command when you launch the app, e.g. --retrieval.store.name=MEMORY.
Name | Values | Description |
---|---|---|
--retrieval.store.name | MEMORY (by default in dev), REDIS (by default in prod) | Change the database engine. Memory is not persistent! |
--retrieval.dataset.load | true, false (by default) | Whether the app must index a dataset at startup (see "Perform a quick test" section) |
--retrieval.dataset.path | A path | Dataset path, only useful if --retrieval.dataset.load is true |
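As a rough illustration of how these flags combine, the sketch below assembles the launch command as an argument list. The flag names come from the table above; the helper `build_launch_command`, the WAR file name with version `1.0`, and the dataset path are hypothetical and only serve the example.

```python
import shlex

def build_launch_command(war_path, store="MEMORY", profile="prod",
                         dataset_path=None):
    """Return the argv list for launching CBIRest with the given flags."""
    if store not in ("MEMORY", "REDIS"):
        raise ValueError(f"unknown store: {store}")
    argv = [
        "java", "-jar", war_path,
        f"--spring.profiles.active={profile}",
        f"--retrieval.store.name={store}",
    ]
    if dataset_path is not None:
        # The dataset path is only read when dataset loading is enabled.
        argv += ["--retrieval.dataset.load=true",
                 f"--retrieval.dataset.path={dataset_path}"]
    return argv

# Hypothetical WAR name and dataset directory, for illustration only.
cmd = build_launch_command("retrieval-1.0-SNAPSHOT.war",
                           dataset_path="/data/images")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd)` would start the server; building a list rather than a string avoids shell quoting problems when the dataset path contains spaces.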
CBIRest API is an HTTP API for the CBIRetrieval Java lib. Most of the configuration is defined by the configuration files from the CBIRetrieval Java lib: config/ConfigServer.prop and config/ConfigClient.prop. Check the config/*.prop files inside your installation package for the latest version. The full documentation is available on the CBIRetrieval Java lib wiki.
You can change these parameters by editing the ConfigServer.prop and ConfigClient.prop files. Some of these settings can be overridden by the CBIRest API configuration flags. For the default values, check your Config*.prop files.
Only the most important flags are listed here:
Name | Values | Description |
---|---|---|
Database config | ||
STORENAME | MEMORY, REDIS | Change the database engine. Memory is not persistent! |
REDISHOST | IP, server name,... | Redis host (only if you use Redis) |
REDISPORT | Port | Redis port (only if you use Redis) |
Quality VS speed config | ||
NUMBEROFPATCH | Number | Number of patches (N) per indexed/request picture. Higher = better quality, lower performance |
NUMBEROFTV | Number (max 50) | Number of test vectors (T). Higher = better quality, lower performance |
Location | ||
VECTORPATH | Path | Path to the test vectors (testsvectors/) |
You first need to index images on the server. CBIRest is incremental: you can add new images to the index at any time. An image is indexed with a POST /api/images. You need to provide the image binary data as multipart content. You may specify the image id (by default: a random number), the storage (by default: a random storage) and some properties (by default: an empty map).
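The indexing call described above could be prepared along these lines. This is a minimal sketch using only the Python standard library: the request is built but not sent. The multipart field name `file`, the query parameter names `id` and `storage`, and the base URL `http://localhost:8080` are assumptions that are not confirmed by this README; check the web client or the API itself for the actual names.

```python
import uuid
from urllib.parse import urlencode
from urllib.request import Request

def build_index_request(base_url, image_bytes, filename,
                        image_id=None, storage=None):
    """Build (without sending) a multipart POST request for /api/images."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        # "file" is an assumed field name, not taken from the CBIRest docs.
        f'Content-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        image_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    # Optional parameters are left out so the server applies its defaults
    # (random id, random storage). Parameter names are assumptions.
    params = {k: v for k, v in (("id", image_id), ("storage", storage))
              if v is not None}
    url = f"{base_url}/api/images"
    if params:
        url += "?" + urlencode(params)
    return Request(url, data=body, method="POST", headers={
        "Content-Type": f"multipart/form-data; boundary={boundary}"})

# Placeholder bytes stand in for real image data.
req = build_index_request("http://localhost:8080", b"\x89PNG...", "cell.png",
                          image_id=42, storage="project-1")
print(req.get_method(), req.full_url)
```

Sending the request would be a matter of `urllib.request.urlopen(req)`, with whatever authentication the server is configured to require.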
To search, simply provide an image and the storages to search in (by default, the search covers all available storages). The service is reached with a POST to /api/search. The CBIRest server returns a list of similar pictures, sorted by similarity. For each image, the server returns the id, storage and properties.
Each image that appears in the CBIR results must first be indexed on the server. The server extracts data and indexes this information in a specific storage database (MEMORY or REDIS). An image is identified by an id (long number), and you may add extra information (such as date, path, ...). Each time a new image is added, the server also saves its thumb on the filesystem.
Description | Verb/Path | Params | Response |
---|---|---|---|
Get all images | GET /api/images | | List of image data |
Get all images for a storage | GET /api/{storage}/images | | List of image data |
Get a single image's data | GET /api/images/{id}, GET /api/storages/{storage}/images/{id} (better for performance) | | Image data |
Add a new image | POST /api/images | | Image data |
Delete an image | DELETE /api/storages/{storage}/images/{id} | | Deleted image data |
Retrieve image thumb | GET /api/images/{id}/thumb | | Image thumb |
Index a full set of data. Quick way to index a lot of images. | POST /api/index/full | JSON with [{"id":...,"storage":...,"url":...},...] | |
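For the bulk endpoint in the last row, the JSON body can be generated from a list of images. The sketch below follows the `[{"id":...,"storage":...,"url":...},...]` shape shown in the table; the helper name, storage name and example URLs are hypothetical.

```python
import json

def build_full_index_payload(entries):
    """Turn (id, storage, url) tuples into the JSON body for /api/index/full."""
    payload = [{"id": image_id, "storage": storage, "url": url}
               for image_id, storage, url in entries]
    return json.dumps(payload)

# Illustrative entries only; the URLs do not point to real images.
body = build_full_index_payload([
    (1, "project-1", "http://example.org/img/1.png"),
    (2, "project-1", "http://example.org/img/2.png"),
])
print(body)
```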
A storage is a virtual space on the server. You can store all images inside a single storage, but it's not an ideal solution. The storage concept allows you to:
Description | Verb/Path | Params | Response |
---|---|---|---|
Get all storages | GET /api/storages | | List of storage data |
Get a specific storage | GET /api/storages/{id} | | Storage data |
Create a new storage | POST /api/storages | JSON string in body: {"id":"$NAME"} | Storage data |
Delete a storage. All image data in this storage will be deleted. | DELETE /api/storages/{id} | | |
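The create and delete calls from the table could be prepared as follows. The paths and the `{"id": ...}` body come from the table; the `Content-Type: application/json` header and the base URL are assumptions, and the requests are only built here, not sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request

def create_storage_request(base_url, name):
    """Build a POST /api/storages request with {"id": name} as JSON body."""
    body = json.dumps({"id": name}).encode("utf-8")
    # The JSON content type is an assumption; the README only shows the body.
    return Request(f"{base_url}/api/storages", data=body, method="POST",
                   headers={"Content-Type": "application/json"})

def delete_storage_request(base_url, name):
    """Build a DELETE /api/storages/{id} request (removes all its images)."""
    # Percent-encode the name so it stays a single path segment.
    return Request(f"{base_url}/api/storages/{quote(name, safe='')}",
                   method="DELETE")

create = create_storage_request("http://localhost:8080", "project-1")
delete = delete_storage_request("http://localhost:8080", "project 1")
print(create.get_method(), create.full_url, create.data.decode())
print(delete.get_method(), delete.full_url)
```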
The search API provides methods to search for similar images.
Description | Verb/Path | Params | Response |
---|---|---|---|
Search for similar images | POST /api/search | | Request id and results under "data" |
Search for similar images | POST /api/searchUrl | | Request id and results under "data" (for each image: id, storage, properties and similarities) |
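A client would typically read the results under "data" and keep the best matches. The sketch below assumes each result is a JSON object with the keys `id`, `storage`, `properties` and `similarities`, and that the similarity is a single number where higher means more similar. Only the field names are mentioned in the table; the exact JSON layout and the sample response are hypothetical.

```python
import json

def top_matches(response_text, min_similarity=0.0):
    """Return (id, storage, similarity) tuples, best match first."""
    results = json.loads(response_text)["data"]
    kept = [(r["id"], r["storage"], r["similarities"])
            for r in results if r["similarities"] >= min_similarity]
    return sorted(kept, key=lambda match: match[2], reverse=True)

# Hypothetical response body, made up for this example.
sample = json.dumps({"data": [
    {"id": 7, "storage": "project-1", "properties": {}, "similarities": 0.42},
    {"id": 3, "storage": "project-1", "properties": {}, "similarities": 0.91},
    {"id": 9, "storage": "project-2", "properties": {}, "similarities": 0.05},
]})
print(top_matches(sample, min_similarity=0.1))
# prints [(3, 'project-1', 0.91), (7, 'project-1', 0.42)]
```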
Image data needs to be stored in a very large hash table. You can choose between the MEMORY and REDIS implementations.
The CBIRetrieval Java lib used in the CBIRest server supports distributed deployment. In the future, you should be able to run multiple CBIRest servers and search on all of them simultaneously. For now, there is only a working prototype, and it is not ready for production.
Cytomine is a rich internet application for visualization, collaborative annotation, and automatic analysis of large-scale bioimages.
The goal of the CYTOMINE project is to develop a modern internet application using data mining for large-scale bioimage exploitation, in order to help life scientists better evaluate drug treatments, understand biological processes, and ease diagnosis. Our application uses fully web-based technologies, so end users do not need to install proprietary software to visualize, annotate, and analyze imaging data. Although our initial focus was on lung cancer and inflammation in cytology and histology images, we seek to provide generic software and tailored services for other diseases, biological processes, and other types of high-dimensional imaging data.
One of the key concepts of Cytomine is the ability to draw annotations on an image. You can annotate your images by drawing multiple regions of interest (ellipses, rectangles, polygons, freehand drawings) and associate them with user-defined terms from structured vocabularies (ontologies).
Each time a user draws an annotation, we retrieve the annotation image and send an index HTTP request to the CBIRest API server.
Each time a user clicks on an annotation, we retrieve the annotation image and perform a search request on the CBIRest API server. The server responds with a sorted list of similar annotation ids.
From the list of similar annotations and their respective similarities, Cytomine computes the “suggested terms”. If the CBIR results contain many annotations of term X with a high similarity, the system suggests this term (e.g. Tumor: 91%, Artefact: 5%, ...).
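One plausible way to turn search results into suggested terms is a similarity-weighted vote, sketched below. This is an assumption made for illustration, not necessarily the formula Cytomine uses; the function name and the sample hits are hypothetical.

```python
from collections import defaultdict

def suggest_terms(results):
    """Rank terms by their share of the total similarity.

    results: iterable of (term, similarity) pairs, one per similar annotation.
    Returns [(term, share)] sorted by decreasing share; shares sum to 1.
    """
    weights = defaultdict(float)
    for term, similarity in results:
        weights[term] += similarity
    total = sum(weights.values())
    if total == 0:
        return []  # no usable evidence, so no suggestion
    return sorted(((term, weight / total) for term, weight in weights.items()),
                  key=lambda pair: pair[1], reverse=True)

# Made-up hits: three annotations tagged Tumor, one tagged Artefact.
hits = [("Tumor", 0.9), ("Tumor", 0.8), ("Artefact", 0.2), ("Tumor", 0.6)]
for term, share in suggest_terms(hits):
    print(f"{term}: {share:.0%}")
```

Weighting by similarity means that a few very close matches can outweigh many distant ones, which matches the "a lot of annotations with a high similarity" wording above.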
In Cytomine, images (and their annotations) are stored inside projects. A project has its own ontology (list of terms), but an ontology can be associated with multiple projects. This means that there is no need to search for similar annotations in projects that use a different ontology. In Cytomine, we create one storage per project and index each annotation in its project's storage. This lets us search for similar annotations only in the current project, or in all projects sharing the same ontology. Less computation, better performance!
Issues and questions can be posted on the GitHub issue page.
Many improvements are still possible:
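The storage selection described above can be sketched as a small lookup. The mapping from project to ontology and the project names are hypothetical; the only rule taken from the text is one storage per project, searched together when the projects share an ontology.

```python
def storages_for_search(project, project_ontology, same_ontology=True):
    """Return the storage names (one per project) that a search should cover."""
    if not same_ontology:
        return [project]  # restrict the search to the current project
    ontology = project_ontology[project]
    return sorted(name for name, onto in project_ontology.items()
                  if onto == ontology)

# Hypothetical projects and ontologies, for illustration only.
project_ontology = {
    "lung-2014": "histology",
    "lung-2015": "histology",
    "zebrafish": "embryo",
}
print(storages_for_search("lung-2015", project_ontology))
# prints ['lung-2014', 'lung-2015']
```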
List of contributors:
This software is an optimized, multi-threaded implementation of the algorithm described in this original research article:
Incremental Indexing and Distributed Image Search using Shared Randomized Vocabularies
/*
* Copyright 2015. Authors: ROLLUS Loïc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/