Indexing of micro/nano objects.
Exact/fuzzy search and recognition of micro/nano objects.

The NBITSearch module lets you create indexes, each of which supports efficient exact and fuzzy search across billions of source micro/nano objects (microobjects).

Microobjects can be indexed by mapping them to NBIT-primitives.

Small-angle scattering (SAS) methods are widely used to study the structure of matter. They are based on the scattering of electromagnetic radiation or a particle beam (electrons, neutrons) by inhomogeneities in the matter.

Information about a particle can be obtained by analyzing how the intensity distribution of the scattered radiation depends on the structure of the scatterer.

Two-dimensional SAS – "2D-SAS"

In protein classification methods, when the atomic model of the protein is unknown, information about its structure can be obtained by comparing the intensity functions I(s), where s is the modulus of the scattering vector (momentum transfer): s = 4 * PI * sin(u) / x, where 2u is the scattering angle and x is the wavelength.

This method is two-dimensional (2D-SAS), since it reduces to obtaining and analyzing two-dimensional experimental curves I(s).
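
As a minimal sketch, the formula for s can be evaluated directly. The function name, the use of radians, and the example values (a 1 degree scattering angle, a wavelength of 0.154 nm) are assumptions made for illustration; they are not part of NBITSearch.

```python
import math

def scattering_vector_modulus(scattering_angle, wavelength):
    """Return s = 4 * PI * sin(u) / x.

    scattering_angle is 2u, in radians; wavelength is x.
    The result is expressed in inverse wavelength units.
    """
    if wavelength <= 0:
        raise ValueError("wavelength must be positive")
    u = scattering_angle / 2.0  # half of the scattering angle
    return 4.0 * math.pi * math.sin(u) / wavelength

# Illustrative values only: 2u = 1 degree, x = 0.154 nm
s = scattering_vector_modulus(math.radians(1.0), 0.154)
print(f"s = {s:.4f} 1/nm")
```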

Fig.1. Comparison of proteins by comparing their intensity functions I(s). See Rapid protein classification.
Similar functions correspond to proteins of similar shape.

When deciding whether one function is similar to another, the question immediately arises of which measure of similarity is appropriate. Choosing such a criterion is always extremely difficult.

In the studies described in Rapid protein classification, the method of least squares was used to compare the I(s) functions.

The method of least squares is a function approximation method typically used to find analytic functions that approximately describe experimental curves.

The criterion of this method is the minimization of the sum of squared deviations of the approximating function from the original curve over the whole approximation interval.

Such a criterion imposes strict limits on the approximation and considerably narrows the set of functions that could be considered similar to the source pattern function under other criteria.
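
The least-squares criterion can be sketched for the simplest approximating function, a straight line. The function names and sample points below are hypothetical and only illustrate the criterion; they are not taken from the cited studies.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_deviations(xs, ys, slope, intercept):
    """Value of the least-squares criterion for a given line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Made-up sample points lying close to a straight line
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

slope, intercept = fit_line(xs, ys)
best = sum_squared_deviations(xs, ys, slope, intercept)

# Any other line gives a larger value of the criterion
worse = sum_squared_deviations(xs, ys, slope + 0.5, intercept)
print(best < worse)
```

The fitted line is the one for which the sum of squared deviations is smallest; shifting the slope or the intercept away from the fitted values can only increase it.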

For example, experimental curves often have pulsed anomalies for various reasons (see Fig.2). These anomalies may result from external factors and may be accidental or insignificant.

Fig.2. Comparison of the experimental curves.
The function on the right has pulsed anomalies.
Despite this, the functions are similar to each other.

The method of least squares is not suitable for identifying such curves or establishing their similarity.

NBITSearch makes it possible to carry out an adjustable fuzzy search for objects similar to given ones.

The search inaccuracy can be adjusted through an interface, much as settings are adjusted on a standard oscilloscope by turning knobs.

If the database contains a huge number of functions associated with microscopic objects, then, among other things, the question immediately arises of whether queries to this database run fast enough.
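
The effect of a single pulsed anomaly on the least-squares criterion can be illustrated with synthetic data. The curves below are made up, and the median-based measure is only one example of a more tolerant criterion chosen for this sketch; it is not claimed to be the measure NBITSearch uses.

```python
import math
from statistics import median

def sum_squared_deviations(a, b):
    """Least-squares distance between two curves sampled at the same points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def median_absolute_deviation(a, b):
    """A tolerant distance: the median of the pointwise absolute differences."""
    return median([abs(p - q) for p, q in zip(a, b)])

# Synthetic curves sampled at 50 points
reference = [math.exp(-0.1 * i) for i in range(50)]
different = [math.exp(-0.2 * i) for i in range(50)]  # a genuinely different decay
spiky = list(reference)
spiky[20] += 5.0  # same curve with one pulsed anomaly

sse_spiky = sum_squared_deviations(reference, spiky)
sse_different = sum_squared_deviations(reference, different)
mad_spiky = median_absolute_deviation(reference, spiky)
mad_different = median_absolute_deviation(reference, different)

# Least squares ranks the spiky copy as LESS similar than the different curve
print(sse_spiky > sse_different)
# The median-based measure ranks the spiky copy as MORE similar
print(mad_spiky < mad_different)
```

One anomalous point dominates the sum of squares, so a curve that matches the reference everywhere else is judged further away than a curve with a different shape.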
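
The idea of adjustable inaccuracy can be sketched with two hypothetical "knobs": a per-point tolerance and a permitted number of outlying points. This is a brute-force scan over a tiny made-up database, written only to illustrate the concept; the names `is_similar` and `fuzzy_search` and both parameters are assumptions and do not represent the NBITSearch interface or its indexing method.

```python
def is_similar(query, candidate, tolerance, max_outliers):
    """True if at most max_outliers points differ by more than tolerance."""
    if len(query) != len(candidate):
        return False
    outliers = sum(1 for q, c in zip(query, candidate) if abs(q - c) > tolerance)
    return outliers <= max_outliers

def fuzzy_search(database, query, tolerance=0.0, max_outliers=0):
    """Brute-force scan returning the names of stored curves similar to query."""
    return [name for name, curve in database.items()
            if is_similar(query, curve, tolerance, max_outliers)]

# Made-up curves, all sampled at the same five points
database = {
    "exact_copy": [1.0, 0.8, 0.6, 0.4, 0.2],
    "noisy_copy": [1.0, 0.75, 0.65, 0.4, 0.25],
    "one_spike": [1.0, 0.8, 3.0, 0.4, 0.2],
    "different": [0.2, 0.4, 0.6, 0.8, 1.0],
}
query = [1.0, 0.8, 0.6, 0.4, 0.2]

exact = fuzzy_search(database, query)                                # both knobs at zero
loose = fuzzy_search(database, query, tolerance=0.1)                 # allow small noise
looser = fuzzy_search(database, query, tolerance=0.1, max_outliers=1)  # allow one spike
print(exact, loose, looser)
```

With both knobs at zero the search is exact; raising the tolerance admits noisy copies, and permitting one outlier also admits a curve with a single pulsed anomaly.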

Three-dimensional SAS – "3D-SAS"

3D-SAS, in contrast to 2D-SAS, reduces to obtaining and analyzing three-dimensional surfaces of the experimental radiation intensity I(v,x), where v = 2u is the scattering angle and x is the wavelength.

Surfaces of radiation intensity I(v,x) are matrices, which are NBIT-primitives.

NBITSearch makes it possible to compare protein structures more accurately by the form of matrices than by the shape of functions, since a matrix describing an object contains significantly more information than the functional curves describing the same object.

Examples of regularities in matrices are shown in Fig.3.
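
A sampled surface I(v,x) can be stored as a matrix with one row per scattering angle and one column per wavelength. The sketch below assumes a plain list-of-rows layout and a toy intensity function invented for the example; neither the function `sample_surface` nor the toy model comes from NBITSearch or from real measurements.

```python
import math

def sample_surface(intensity, angles, wavelengths):
    """Return a matrix whose element [i][j] is intensity(angles[i], wavelengths[j])."""
    return [[intensity(v, x) for x in wavelengths] for v in angles]

def toy_intensity(v, x):
    """Toy model only: a Gaussian in s = 4 * PI * sin(v / 2) / x."""
    s = 4.0 * math.pi * math.sin(v / 2.0) / x
    return math.exp(-s * s)

# Illustrative grid: 4 scattering angles (radians) by 3 wavelengths
angles = [math.radians(d) for d in (0.5, 1.0, 1.5, 2.0)]
wavelengths = [0.10, 0.15, 0.20]

matrix = sample_surface(toy_intensity, angles, wavelengths)
for row in matrix:
    print(" ".join(f"{value:.3f}" for value in row))
```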

Fig.3. Examples of regularities in matrices.

Search Technology developed with support from FASIE
foundation formed by the Government of Russian Federation
Novosib-BIT LLC 2004 - 2017