Using the full text search engine

This article explains how you can use the Infradox full text search engine.

Introduction

File metadata can be extracted from the files themselves (e.g. from embedded IPTC, XMP or Photoshop data), supplied as sidecar XML, edited online, imported from CSV files that you upload, and so on. In every case, the file metadata is stored in the Infradox database, and all database fields are full text indexed, i.e. searchable. This includes file names and file numbers. Your metadata may be enriched with e.g. synonyms and/or related terms, but also with a number of unique, automatically generated codes to aid search filtering and permissions. These codes are also part of the index.

Searching is simple, and the client facing websites can be configured to show various functions that help with building more complex search queries and filters. You can create complex Boolean search queries without having to know how full text searching works, simply by entering words in the search boxes and selecting the appropriate radio buttons. The software then creates the correct Boolean search for you.

However, especially for staff members, it’s important to understand the full text search engine, because it lets you create powerful search queries that you may need at some point. For example, let’s say you need to find all files from all photographers in a specific supplier group that have ranking value 3 and that were uploaded in 2019. If you understand how to use the full text search engine and the filter codes, queries like these are easy to create.

Boolean operators

Searching is as simple as typing one or more keywords in the search box and pressing Enter. The search engine returns all files that contain the words you’ve typed. The Infradox search engine supports the Boolean operators AND, OR, AND NOT, NEAR and LIKE. Searching for more than one word is the same as using the AND operator. For example, searching for cat dog is the same as searching for cat AND dog, so you’ll find files that have both words. If you want to find files that have either word, use OR, e.g. cat OR dog. You can also combine multiple Boolean operators. To find files that have either cat or dog, but that also have man and park, enter (cat OR dog) AND (man AND park). The brackets are essential in this example. To run the previous query but exclude files that have the word Amsterdam, enter ((cat OR dog) AND (man AND park)) AND NOT amsterdam.

Below are a few more examples of how you can search by typing queries in the search box.

  • dog OR cat
    to find files with the word dog or the word cat
  • dog AND cat
    to find files with both the word dog and the word cat
  • dog cat
    is the same as dog AND cat
  • (dog or cat) and park
    to find files with the words dog or cat and the word park (note the use of the round brackets)
  • lion and not africa
    to find files with the word lion but not if the word africa is also present
  • (tiger or lion) and not zoo
    to find files with the words tiger or lion but not if the word zoo is also present

Round brackets control how, and in which order, Boolean expressions are evaluated. For example, in (dog or cat) and not park, the search engine first looks for files with dog or cat, and then excludes the ones that also have the word park.
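
To see why the position of the brackets matters, the grouping can be imitated with plain Python sets. This is only an illustrative sketch and not how the Infradox engine works internally; the file names, the word sets and the helper having() are made up for the example.

```python
# Toy data (hypothetical): each file is represented by the set of words
# found in its metadata.
files = {
    "file1": {"dog", "park"},
    "file2": {"cat", "park"},
    "file3": {"dog", "cat", "garden"},
    "file4": {"horse", "park"},
}


def having(word):
    """Return the names of all files whose metadata contains the word."""
    return {name for name, words in files.items() if word in words}


# (dog or cat) and not park
with_brackets = (having("dog") | having("cat")) - having("park")

# dog or (cat and not park): same words, different grouping
other_grouping = having("dog") | (having("cat") - having("park"))

print(sorted(with_brackets))
print(sorted(other_grouping))
```

With this toy data the first grouping keeps only file3, while the second also keeps file1, because in the second grouping the exclusion of park applies to the cat files only.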

The search engine can be configured to exclude specific fields from the index and you can also configure word stemming. The latter is explained in a separate article: Word stemming.

Wildcards

The Infradox search engine also supports wildcards. Use ? (question mark) as a wildcard for a single position, and * (asterisk) as a wildcard for any number of positions. Have a look at the examples below:

  • dog*
    to find files with the word dog or words that start with dog – e.g. dog, dogs, dogfight and so on
  • *dog
    to find files with the word dog or words that end with dog – e.g. dog, sheepdog, watchdog
  • d*g
    to find files with words that start with d and end with g regardless of the number of characters in between – e.g. dog, dancing, doing, drug
  • d?g
    to find files with words that start with a d, end with a g, and that have exactly one character between the two – e.g. dog, dig
  • d*g?
    to find files with words that start with d, followed by any number of characters, then a g, and then exactly one more character – e.g. dancings, doings
  • 08??????
    to find files with numbers that are exactly 8 digits long and that start with 08
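
The wildcard patterns above can be tried out on a word list with Python's standard fnmatch module, which happens to use the same ? and * symbols. This is a sketch for experimenting with the patterns only; it is an assumption that the Infradox wildcards behave the same way in every detail, and the word list is made up.

```python
from fnmatch import fnmatchcase

# Hypothetical word list standing in for indexed words and file numbers.
words = ["dog", "dogs", "dig", "drug", "doing", "doings",
         "sheepdog", "08123456", "0812345"]


def matching(pattern):
    """Return the words that match a pattern using ? and * wildcards."""
    return [word for word in words if fnmatchcase(word, pattern)]


for pattern in ["dog*", "*dog", "d*g", "d?g", "d*g?", "08??????"]:
    print(pattern, "->", matching(pattern))
```

Note that 08?????? matches 08123456 but not 0812345, because every ? stands for exactly one position.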

Using filter codes

The Infradox platform uses a number of standardised filter codes as well as custom filter codes that you can create yourself. For an overview of the standard codes and detailed information about this subject, please read Search filters and access & deny codes. A solid understanding of the filter codes is important if you want to be able to create advanced search queries.

You can use the filter codes in search queries, and you can combine them with the wildcards that the search engine supports. Note that instead of typing the # sign that is part of a code, you use the question mark (?) wildcard in its place. For example:

  • ((flower and spring and not garden) and @sup110?) and (Q1901* or Q1902*)
    This query returns files with the words flower and spring but without the word garden, from supplier 110, and only if they were uploaded in either January or February 2019.
  • @SG20? and Q19* and 055?????
    This query returns all files from all photographers in supplier group 20 (@SG20?) that were uploaded in 2019 (Q19*) and that have a number that is 8 digits long and that starts with 055.
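
If you need to build such queries often, for example in a script that prepares search strings, the code patterns from the two examples above can be assembled programmatically. The sketch below is hypothetical: the function names are invented, and the code formats (@SG followed by the group number and ?, Q followed by a two-digit year and an optional two-digit month and *) are inferred only from the examples in this article.

```python
def supplier_group_code(group):
    """Build a supplier group filter, following the @SG20? example."""
    return f"@SG{group}?"


def upload_code(year, month=None):
    """Build an upload date filter, following the Q19* and Q1901* examples."""
    yy = f"{year % 100:02d}"
    mm = f"{month:02d}" if month is not None else ""
    return f"Q{yy}{mm}*"


def number_pattern(prefix, length):
    """Pad a number prefix with ? wildcards up to a fixed length."""
    return prefix + "?" * (length - len(prefix))


query = " and ".join([
    supplier_group_code(20),
    upload_code(2019),
    number_pattern("055", 8),
])
print(query)
```

The resulting string can be pasted into the search box like any query typed by hand.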

Searching for phrases

A search for two or more words returns files that contain all of the specified keywords, in any position. To have the search engine treat the words as a phrase, surround them with quotes. E.g. “african elephant” returns images of African elephants, but not images that merely contain both keywords independently of one another. For instance, an image of an Indian elephant at the Nairobi Zoo may include both the keywords african and elephant, but it would not be returned by a search for “african elephant”.
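
The difference between an all-words search and a phrase search can be illustrated with a few lines of Python. This is a simplified sketch with made-up metadata strings and helper names; it assumes that a phrase means the words appear next to each other in the given order, and it says nothing about how the Infradox engine implements this.

```python
def words_of(text):
    """Split text into lowercase words."""
    return text.lower().split()


def has_all_words(text, query):
    """True if every query word occurs somewhere in the text."""
    words = words_of(text)
    return all(word in words for word in words_of(query))


def has_phrase(text, phrase):
    """True if the phrase words occur consecutively and in order."""
    words = words_of(text)
    target = words_of(phrase)
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


# Hypothetical metadata for two images.
zoo_image = "indian elephant nairobi zoo african wildlife"
savanna_image = "african elephant drinking at a waterhole"

print(has_all_words(zoo_image, "african elephant"))   # both words present
print(has_phrase(zoo_image, "african elephant"))      # but not as a phrase
print(has_phrase(savanna_image, "african elephant"))
```

The zoo image matches the plain two-word search but not the quoted phrase, while the savanna image matches both.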