Chapter 3. The Metaweb Query Language

This chapter explains the Metaweb Query Language, or MQL, [7] which is used to express Metaweb queries. This chapter begins with an explanation of the JSON data format, on which the MQL grammar is based. That prerequisite material is followed by an extended tutorial that teaches MQL by example. You are expected and encouraged to run queries and to experiment with your own queries, using a "query editor" program like the one at http://www.freebase.com/tools/queryeditor to submit your queries to Metaweb and obtain the results.

This chapter teaches you to write MQL queries, but does not explain how to issue those queries to and retrieve responses from Metaweb servers: that is the topic of Chapter 4. Also, this chapter does not cover updates, or writes, to Metaweb. Updates are expressed using a variant of MQL that is covered in Chapter 5.

3.1. JavaScript Object Notation

The Metaweb queries and responses we saw in Chapter 1 contained a lot of punctuation: curly braces, quotation marks, colons, and commas. Before we study more queries, it is important to understand this punctuation. Metaweb queries and responses use a plain-text data interchange format known as JavaScript Object Notation or, more commonly, JSON. If you are a JavaScript programmer, then this format will be familiar to you since it is a subset of the JavaScript language. [8] If you are not a JavaScript programmer, the format is easy to learn, and does not require the use of the JavaScript language.

JSON is formally described in RFC 4627 (http://www.ietf.org/rfc/rfc4627.txt), and is also documented at http://json.org. The JSON website includes pointers to code, in a variety of programming languages, for serializing data structures into JSON format and for parsing JSON text into data structures. [9]
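As a brief illustration of such code, the following sketch uses Python's standard json module to serialize a data structure into JSON text and to parse it back. Python is our choice for illustration only, the variable names are hypothetical, and the query is the one used as an example later in this chapter.

```python
import json

# A Python dict standing in for a Metaweb query object.
query = {"type": "/music/artist", "name": "The Police", "album": []}

# Serialize the data structure into JSON text.
text = json.dumps(query)

# Parse the JSON text back into an equivalent data structure.
parsed = json.loads(text)

# The JSON literals null, true, and false become None, True, and False.
literals = json.loads("[null, true, false]")
```

Round-tripping a structure through dumps and loads in this way yields an equal structure, which is a convenient sanity check when building queries programmatically.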

A JSON-formatted string is a serialized form of an array or object. The array or object may contain numbers, strings, other arrays and objects, and the literal values null, true, and false. These JSON values are illustrated in Figure 3.1 and explained in the sub-sections that follow:

Figure 3.1. JSON Values


3.1.1. JSON Literals: null, true, false

JSON supports three literal values. null is a JSON value representing "no value". The literals true and false represent the two possible Boolean values.

3.1.2. JSON Numbers

A JSON number consists of an optional minus sign followed by an integer part followed by an optional decimal point and fractional part followed by an optional exponent. This format is the same as the format described for /type/float in Chapter 2. All numbers use decimal digits: octal and hexadecimal notation are not supported.
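The number grammar can be probed with a small Python sketch. The helper name parse_json_number is hypothetical, and the sketch relies on Python's json module, which accepts NaN and Infinity as a non-standard extension unless told otherwise.

```python
import json

def _reject_constant(name):
    # Python's parser accepts NaN and Infinity as an extension; JSON does not.
    raise ValueError("not a JSON number: " + name)

def parse_json_number(text):
    """Return the number that text encodes, or None if it is not a JSON number."""
    try:
        value = json.loads(text, parse_constant=_reject_constant)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    # bool is a subclass of int in Python, so exclude true/false explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    return value
```

With this helper, decimal forms such as -12 and 2.5e3 are accepted, while hexadecimal (0x1F) and octal-style (017) forms are rejected.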

Figure 3.2. JSON string syntax


3.1.3. JSON Strings

A JSON string is much like a string in Java or JavaScript: zero or more Unicode characters [10] between double quotation marks. See Figure 3.2.

A backslash is special: it is an escape character and is interpreted along with the character or characters that follow:

Escape    Character
\"        A quotation mark that does not terminate the string
\\        A single backslash character that is not an escape
\/        A forward slash character. Although it is legal to escape the forward slash character, it is never necessary to do so.
\b        The Backspace character
\f        The Formfeed character
\n        The Newline character
\r        The Carriage Return character
\t        The Tab character
\uXXXX    The Unicode character whose encoding is the four hexadecimal digits XXXX. To encode extended Unicode codepoints that do not fit in four hex digits, use two \u escapes to encode a UTF-16 surrogate pair.
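To see these escapes in action, here is a short sketch using Python's standard json module (our choice for illustration). The sample strings are made up for the example, and the musical G clef character U+1D11E was chosen only because it lies outside the four-hex-digit range.

```python
import json

# A raw string keeps the backslashes intact so the JSON parser sees the escapes.
decoded = json.loads(r'"tab:\t quote:\" slash:\/ e-acute:\u00e9"')

# U+1D11E does not fit in four hex digits, so it is written
# as a UTF-16 surrogate pair of two \u escapes.
clef = json.loads(r'"\ud834\udd1e"')

# Serializing with the default ensure_ascii=True produces the same pair.
encoded = json.dumps("\U0001D11E")
```

The parser joins the surrogate pair into a single character, so clef has length one.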

3.1.4. JSON Arrays

An array is a comma-separated list of JSON values enclosed in square brackets. See Figure 3.3.

Arrays may contain any JSON values, including objects and other arrays. The elements of a JSON array need not have the same type (though in MQL they always do). The following JSON array might be returned in response to a MQL query:

["Outlandos d'Amour", "Reggatta de Blanc", "Zenyatta Mondatta"]

Figure 3.3. JSON array syntax


A JSON array with no elements consists of just the square brackets: []. Empty arrays often appear in MQL queries.

3.1.5. JSON Objects

A JSON object is named after the JavaScript object type, and is not very much like the objects of strongly-typed object-oriented programming languages. Instead, think of an object as:

  • an associative array;

  • a hashtable that maps strings to values;

  • a dictionary; or

  • an unordered set of named values.

JSON objects are written as a comma-separated list of name/value pairs, enclosed in curly braces. A name/value pair is a JSON string (the name) followed by a colon followed by any JSON value, which may include nested objects and arrays. See Figure 3.4.

Figure 3.4. JSON object syntax


Here is an example JSON object (which also happens to be a Metaweb query):

{
  "type" : "/music/artist",
  "name" : "The Police",
  "album" : []
}

JavaScript programmers should note that JSON requires property names to appear within double quotes, even though the JavaScript language does not. Arbitrary whitespace is allowed within JSON objects and arrays, but trailing commas (after the final array element or last name/value pair) are not. An empty JSON object, with no properties at all, is simply a pair of curly braces: {}. As we'll see, empty objects are not uncommon in MQL queries.
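These rules are easy to check with a strict parser. The sketch below uses Python's json module; the helper name is_valid_json is hypothetical.

```python
import json

def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

quoted = is_valid_json('{"album": []}')           # names in double quotes
unquoted = is_valid_json('{album: []}')           # legal JavaScript, not JSON
trailing = is_valid_json('{"album": [],}')        # trailing comma in an object
spaced = is_valid_json('{\n  "album" :\t[ ]\n}')  # arbitrary whitespace
empty = is_valid_json('{}')                       # the empty object
```

Running a query through such a check before submitting it can catch the unquoted names and trailing commas that JavaScript programmers tend to write out of habit.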

3.2. Basic MQL Queries

This section is a tutorial that teaches basic Metaweb queries by example, and uses freebase.com as a source of interesting data to query. Follow along as you read by running the queries presented. To do this, you need a simple way to submit a query to freebase.com and view the result. You can do this with the Freebase query editor at http://www.freebase.com/tools/queryeditor/.

3.2.1. Our First Query

Let's begin by revisiting the simple query from Chapter 1. We would like to know what albums The Police have recorded:

{
  "type" : "/music/artist",
  "name" : "The Police",
  "album" : []
}

When we run this query, we get the following JSON object in response (some of the album names are omitted here for brevity):

{
  "type": "/music/artist",
  "name": "The Police",
  "album": [
    "Outlandos d'Amour",
    "Reggatta de Blanc",
    "Zenyatta Mondatta",
    "Ghost in the Machine",
    "Synchronicity"
  ]
}

To query Metaweb we tell it what we already know by specifying properties and their values:

    "type" : "/music/artist",
    "name" : "The Police",

And then we tell it what we want to know by specifying properties without values:

    "album" : []

Sending an empty array in a MQL query tells Metaweb that we'd like to have the array filled in.

3.2.2. Query/Response Symmetry

Let's look one more time at the simple "albums by The Police" query and response from above. This time the query and response are presented side-by-side to emphasize that the query and response objects have the same properties, but the response object has values filled in:

Query Result
{
  "type" : "/music/artist",
  "name" : "The Police",
  "album" : []
}
{
  "type": "/music/artist",
  "name": "The Police",
  "album": [
    "Outlandos d'Amour",
    "Reggatta de Blanc",
    "Zenyatta Mondatta",
    "Ghost in the Machine",
    "Synchronicity"
  ]
}

This symmetry of queries and responses is a fundamental and elegant part of MQL. We'll use this two-column query/response format throughout the chapter.
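Client code can take advantage of this symmetry. The following Python sketch (the function name filled_in is hypothetical, and the query and response are copied from above as Python dicts) checks that a response mirrors its query and picks out the values that Metaweb supplied for the empty arrays.

```python
def filled_in(query, response):
    """Return the values Metaweb supplied for the query's empty arrays."""
    # Symmetry: the response has exactly the same property names as the query.
    if set(query) != set(response):
        raise ValueError("response does not mirror the query")
    return {prop: response[prop] for prop, value in query.items() if value == []}


query = {"type": "/music/artist", "name": "The Police", "album": []}
response = {
    "type": "/music/artist",
    "name": "The Police",
    "album": ["Outlandos d'Amour", "Reggatta de Blanc", "Zenyatta Mondatta",
              "Ghost in the Machine", "Synchronicity"],
}
albums = filled_in(query, response)["album"]
```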

3.2.3. Object IDs

Objects can be given fully-qualified names that can be used as unique identifiers. Query the id property of an object to obtain a unique identifier for it:

Query Result
{
  "type" : "/music/artist",
  "name" : "The Police",
  "id" : null
}
{
  "type": "/music/artist",
  "name": "The Police",
  "id": "/en/the_police"
}

This query includes the same name and type as the last query. But instead of specifying an empty array of albums, it specifies a null id. The null value is our query: this is the field we want Metaweb to fill in. The response looks just like the query, but the null has been replaced with a unique fully-qualified name for The Police.

In addition to querying the id property of an object, we can also use it to uniquely name the object we want. We could rewrite our musical albums query to use id instead of name, for example:

{
  "type": "/music/artist",
  "id": "/en/the_police",
  "album": []
}

Metaweb objects are not required to have a fully-qualified name, but every object is assigned a string of hexadecimal digits that serves as a globally unique identifier or guid. If you query the id property of an object that does not have a fully-qualified name, Metaweb creates a synthetic identifier by prefixing the object's guid with /guid/:

Query Result
{
  "type" : "/music/album",
  "artist": "The Police",
  "name" : "Synchronicity",
  "id" : null
}
{
  "type": "/music/album",
  "artist": "The Police",
  "name": "Synchronicity",
  "id": "/guid/9202a8c04000641f8000000002f9e349"
}

You can use guid-based identifiers as the value of the id property to uniquely identify an object.
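An application that stores ids may want to distinguish guid-based identifiers from fully-qualified names. Here is a minimal Python sketch; the helper name is_guid_id is hypothetical, and the pattern assumes that a guid is always written as 32 lowercase hexadecimal digits, as in the example above.

```python
import re

# Assumption: a guid is 32 lowercase hexadecimal digits, as in the example above.
GUID_ID = re.compile(r"/guid/[0-9a-f]{32}")

def is_guid_id(object_id):
    """Return True if object_id looks like a synthetic /guid/ identifier."""
    return GUID_ID.fullmatch(object_id) is not None
```

For example, is_guid_id("/guid/9202a8c04000641f8000000002f9e349") is true, while is_guid_id("/en/the_police") is false.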

3.2.4. Multiple Results and Uniqueness Errors

Now that we know the id of the object representing The Police, let's turn our query around and ask about the name and type of the object with that id:

{
  "id": "/en/the_police",
  "name" : null,
  "type" : null
}

We're telling Metaweb what we have (the id) and asking for the values (name and type) that we don't have. When we submit this query, though, it doesn't work. The response envelope looks like this:

{
  "status": "200 OK",
  "code": "/api/status/ok",
  "qname": {
    "code": "/api/status/error",
    "messages": [
      {
        "code": "/api/status/error/mql/result",
        "message": "Unique query may have at most one result. Got 4",
        "info": {
          "count": 4,
          "result": [
            "/music/artist",
            "/common/topic",
            "/music/producer",
            "/music/musical_group"
          ]
        },
        "query": [
          {
            "error_inside": "type",
            "type": null,
            "id": "/en/the_police",
            "name": null
          }
        ],
        "path": "type"
      }
    ]
  }
}

This response includes three properties named code. The outermost one indicates that the query was well-formed and could be processed. The middle code property is specific to our query (which we gave the name qname) and specifies that an error occurred. The messages[0] object provides details, including a more specific error code in the innermost code property. The message property of the message object is a human-readable version of this error code, and the info property provides further details about the error. The query property repeats the query that caused the error, and both the query.error_inside and the path properties indicate which part of the query caused the problem.
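The structure just described can be unpacked mechanically. The Python sketch below is only an illustration: the function name query_errors is hypothetical, it relies solely on the property names visible in the envelope above, and it treats any code other than /api/status/error as "no error", which is an assumption.

```python
def query_errors(envelope, query_name):
    """Return (code, message, path) for each error reported for one named query."""
    result = envelope[query_name]
    if result["code"] != "/api/status/error":
        return []
    return [(m["code"], m["message"], m.get("path"))
            for m in result.get("messages", [])]


# A trimmed copy of the envelope shown above.
envelope = {
    "status": "200 OK",
    "code": "/api/status/ok",
    "qname": {
        "code": "/api/status/error",
        "messages": [{
            "code": "/api/status/error/mql/result",
            "message": "Unique query may have at most one result. Got 4",
            "path": "type",
        }],
    },
}
errors = query_errors(envelope, "qname")
```

Here errors holds a single tuple whose last element, "type", names the part of the query that caused the problem.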

What we learn from this response is that Metaweb could not respond to our query because we asked for a single type and it found four types. Let's try the query again. Now we're requesting a single name and an array of types for this uniquely specified object. This query works:

Query Result
{
  "id": "/en/the_police",
  "name" : null,
  "type" : []
}
{
  "id": "/en/the_police",
  "name": "The Police",
  "type": [
    "/music/artist",
    "/common/topic",
    "/music/producer",
    "/music/musical_group"
  ]
}

The Metaweb object we asked about has the name "The Police" and it is a member of four types including /common/topic and /music/artist. Recall from Chapter 2 that /common/topic is a very generic type. Just about every Metaweb object that represents something an end user would have an interest in is a member of this type. The lesson to draw here is that objects almost always have more than one type, and any queries on the type property should use arrays. In general, it is always safe to use [] in place of null in your queries. If there is only one result the array returned in the response will simply have a single element. When you know that there can only be one result, however, it is usually more convenient and efficient to use null.

Uniqueness errors are a common pitfall for developers crafting Metaweb queries. Recall that /type/property allows certain properties to be specified as unique. The id and name properties are unique and can always be queried without square brackets. As we've seen, however, the type property is not unique: objects can (and many objects do) have more than one type. If a property is not guaranteed to be unique, then you should always use square brackets when querying its value.

Although objects can have more than one fully-qualified name, queries of the id property never return more than one of these names. The id property is unique in another, more important, way: no two objects ever share the same id. Therefore, if a query includes an id, you can be confident that no more than one object will match, and a query like this one is correct:

{
  "id": "/en/the_police",
  "name" : null,
  "type" : []
}

The name property is unique, [11] so it is always safe to query name with null, as we do above, rather than []. On the other hand, the query that we started this tutorial with is risky:

{
  "type" : "/music/artist",
  "name" : "The Police",
  "album" : []
}

This query worked for us: Freebase only knows about one musical artist named "The Police". Note, however, that there is no guarantee that this will always be the case. There is nothing to prevent someone from adding another band named "The Police" to freebase.com. If such an addition were made, our query would suddenly fail.

Depending on the design of your application, a uniqueness failure in this situation might actually be exactly what you want. If you get two results when you expected one, then perhaps the right thing to do is fail and display an error message to the user. In practice, however, most MQL programmers simply get in the habit of enclosing all queries in square brackets:

[{
  "type" : "/music/artist",
  "name" : "The Police",
  "album" : []
}]

When you write queries like this, you must be prepared to handle zero, one, or multiple results.
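One way to handle the three cases is sketched below in Python. The function name single_result is hypothetical, and raising an exception for multiple matches is just one possible policy; an application might instead present the matches to the user.

```python
def single_result(results):
    """Return the only result, None if there is none, or raise if ambiguous."""
    if not results:
        return None
    if len(results) > 1:
        raise LookupError("expected at most one match, got %d" % len(results))
    return results[0]
```

The argument is the array returned for a query wrapped in square brackets, already parsed into a Python list.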

3.2.5. Nested Sub-queries

Let's find out more about our favorite band. What are the names of the tracks on the album Synchronicity?

Query Result
{
  "type" : "/music/artist",
  "name" : "The Police",
  "album" : {
    "name" : "Synchronicity",
    "track" : []
  }
}
{
  "type": "/music/artist",
  "name": "The Police",
  "album": {
    "name": "Synchronicity",
    "track": [
      "Synchronicity II",
      "Every Breath You Take",
      "King of Pain",
      "Wrapped Around Your Finger",
      "Tea in the Sahara",
      "Walking in Your Footsteps",
      "Miss Gradenko",
      "Murder by Numbers",
      "O My God",
      "Synchronicity I",
      "Mother"
    ]
  }
}

The interesting thing about this query is that it includes a sub-query nested inside curly braces. We're asking for an array of tracks from an album named "Synchronicity" recorded by a band named "The Police".

Now consider the following query which includes a sub-query within a sub-query. It asks: "what artists have recorded the song Too Much Information? How long is the song and what album does it appear on?". Some of the results have been omitted here.

Query Result
[{
  "type":"/music/artist",
  "name":null,
  "album": [{
    "name":null,
    "track": [{
      "name":"Too Much Information",
      "length": null
    }]
  }]
}]
[{
  "type" : "/music/artist",
  "name" : "The Police",
  "album" : [{
    "name" : "Ghost in the Machine",
    "track" : [{
      "name" : "Too Much Information",
      "length" : 222.733
    }]
  },{
    "name" : "Message in a Box (disc 3)",
    "track" : [{
      "name" : "Too Much Information",
      "length" : 222.733
    }]
  }]
},{
  "type" : "/music/artist",
  "name" : "Duran Duran",
  "album" : [{
    "name" : "Duran Duran",
    "track" : [{
      "name" : "Too Much Information",
      "length" : 296.573
    }]
  }]
},{
  "type" : "/music/artist",
  "name" : "Quiet Riot",
  "album" : [{
    "name" : "Alive and Well",
    "track" : [{
      "name" : "Too Much Information",
      "length" : 268
    }]
  }]
}]

Let's take a closer look at this query. It involves three different objects: an artist, an album, and a track. We can't tell Metaweb anything interesting about the album (such as a name or id): just that it contains the song we're interested in. We can't tell Metaweb anything about the artist object either: just that they recorded an album that includes the song. Despite the seeming vagueness of this query, Metaweb has no trouble finding the answer we want.

At first glance, it seems as if the only information we're providing to Metaweb with this query is the track name. But the structure of the query contains additional implicit information. We've specified that the outermost object is a /music/artist. The definition of this type tells us that objects connected via the album property are expected to be of type /music/album. And the definition of the /music/album type tells us that objects connected through the track property should be of type /music/track. These additional constraints give Metaweb enough information to find the information we want.
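Because the result mirrors the nesting of the query, an application can walk it with one loop per level. The following Python sketch flattens an abbreviated copy of the result above into (artist, album, length) rows; the function name recordings is hypothetical.

```python
def recordings(artists):
    """Flatten an artist/album/track result into (artist, album, length) rows."""
    rows = []
    for artist in artists:
        for album in artist["album"]:
            for track in album["track"]:
                rows.append((artist["name"], album["name"], track["length"]))
    return rows


# An abbreviated copy of the result shown above.
result = [
    {"type": "/music/artist", "name": "The Police", "album": [
        {"name": "Ghost in the Machine",
         "track": [{"name": "Too Much Information", "length": 222.733}]},
        {"name": "Message in a Box (disc 3)",
         "track": [{"name": "Too Much Information", "length": 222.733}]},
    ]},
    {"type": "/music/artist", "name": "Quiet Riot", "album": [
        {"name": "Alive and Well",
         "track": [{"name": "Too Much Information", "length": 268}]},
    ]},
]
rows = recordings(result)
```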

3.2.6. Inverting Queries

With MQL, there is usually more than one way to write a query. This is especially true when a query contains sub-queries – because of the bi-directional nature of Metaweb links, queries can usually be turned "inside out". At the beginning of the last section we wrote a query to ask for the names of tracks on the album Synchronicity. That was an artist-centric query with "type":"/music/artist" in the outermost query. We can invert the query and get the same information with a track-centric query:

[{
  "type":"/music/track",
  "name":null,
  "album":{
    "name":"Synchronicity",
    "artist":"The Police"
  }
}]

This query returns the same list of track names, but the results are much more verbose, since every track object includes a nested album object. In this case, the simplest way to obtain the list of tracks we want is probably with a non-nested album-centric query:

{
  "type" : "/music/album",
  "name" : "Synchronicity",
  "artist" : "The Police",
  "track" : []
}

The previous section included another artist-centric query that asked about recordings of the song "Too Much Information". Here's the album-centric version of that query:

[{
  "type":"/music/album",
  "name":null,
  "artist":null,
  "track": [{
    "name":"Too Much Information",
    "length": null
  }]
}]

We can also invert this query into a track-centric form, of course:

[{
  "type":"/music/track",
  "name":"Too Much Information",
  "length":null,
  "album":[{"name":null, "artist":null}]
}]

The track, album, and artist-centric versions of this query all return the same basic information. Which version is best depends on how you intend to use the results. Often this will be a question of which result format most closely matches the data structures used by the application that is making the query.

3.2.7. Asking Metaweb For Objects

In our queries so far, we've used null and [] to ask Metaweb to fill in a single value or an array of values. There are other ways to ask for information as well. Recall the following query:

{
  "id" : "/en/the_police",
  "name" : null,
  "type" : []
}

It asks for the name and types of a unique object. Both the name and the individual elements of the type array are returned as strings. Recall, however, that the name of an object is of /type/text and that types are of /type/type. /type/text is a value type in the Metaweb object model, but we can treat values as objects if we want to. Let's modify the query to use {} and [{}] instead of null and []. {} asks for a single value, expanded as an object, and [{}] asks for an array of values expanded into objects:

Query Result
{
  "id": "/en/the_police",
  "name" : {},
  "type" : [{}]
}
{
  "id" : "/en/the_police",
  "name" : {
    "lang" : "/lang/en",
    "type" : "/type/text",
    "value" : "The Police"
  },
  "type" : [{
    "id" : "/music/artist",
    "name" : "Musical Artist",
    "type" : ["/type/type","/freebase/type_profile"]
  },{
    "id" : "/common/topic",
    "name" : "Topic",
    "type" : ["/type/type","/freebase/type_profile"]
  },{
    "id" : "/music/producer",
    "name" : "Record Producer",
    "type" : ["/type/type","/freebase/type_profile"]
  },{
    "id" : "/music/musical_group",
    "name" : "Musical Group",
    "type" : ["/type/type","/freebase/type_profile"]
  }]
}

We learn from this query that the English name of the specified object is "The Police". And, in addition to obtaining the ids of the four types of which The Police is a member, we also obtain the human-readable names of those types.

Let's use this query technique to learn more about the tracks on the album Synchronicity. (The result lists only two tracks for brevity.)

Query Result
{
  "type" : "/music/album",
  "name" : "Synchronicity",
  "artist" : "The Police",
  "track" : [{}]
}
{
  "type": "/music/album",
  "name": "Synchronicity",
  "artist": "The Police",
  "track" : [{
    "id" : "/guid/9202a8c04000641f800000000120b4ca",
    "name" : "Synchronicity II",
    "type" : [
      "/music/track",
      "/music/song",
      "/music/composition"
    ]
  },{
    "id" : "/guid/9202a8c04000641f8000000001275dd7",
    "name" : "King of Pain",
    "type" : ["/music/track"]
  }]
}

This query doesn't actually tell us much about the tracks themselves. We already know the type of the tracks. The id might be useful in future queries, but it doesn't tell us anything about the track. The name is useful, but we could have obtained that without using curly braces, just by querying "track":[].

When you ask Metaweb to fill in empty curly braces for you, it returns all the properties if the value is a value type. The name property of an object is of /type/text, and querying it with {} returns all of its properties. If the property is an object type instead of a value type, then Metaweb returns only the name, type and id properties (all of which are defined by /type/object and are common to all Metaweb objects). That is, instead of using [{}], we could write out the query explicitly like this:

{
  "type" : "/music/album",
  "name" : "Synchronicity",
  "artist" : "The Police",
  "track" : [{
    "name" : null,
    "id" : null,
    "type" : []
  }]
}

If we want to know more about an object than its name, id, and types, then we must refine our query to express exactly what it is we would like to know. Here's how we ask for just the name and length of each of the tracks:

Query Result
{
  "type" : "/music/album",
  "name" : "Synchronicity",
  "artist" : "The Police",
  "track" : [{
     "name":null,
     "length":null
  }]
}
{
  "type" : "/music/album",
  "name" : "Synchronicity",
  "artist" : "The Police",
  "track" : [
    {"name":"Synchronicity II", "length":305.066},
    {"name":"Every Breath You Take", "length":254.066},
    {"name":"King of Pain", "length":299.066},
    {"name":"Wrapped Around Your Finger", "length":313.733},
    {"name":"Tea in the Sahara", "length":255.44},
    {"name":"Walking in Your Footsteps", "length":216.773},
    {"name":"Miss Gradenko", "length":120},
    {"name":"Murder by Numbers", "length":276.8},
    {"name":"O My God", "length":242.226},
    {"name":"Synchronicity I", "length":202.866},
    {"name":"Mother", "length":185.64}
  ]
}
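A result like this is easy to post-process. The Python sketch below formats track lengths for display; the function name format_length is hypothetical, and it assumes that the length values are durations in seconds, which the result itself does not state.

```python
def format_length(seconds):
    """Format a duration in seconds as minutes:seconds, dropping the fraction."""
    whole = int(seconds)
    return "%d:%02d" % (whole // 60, whole % 60)


# Two of the tracks from the result above.
tracks = [
    {"name": "Synchronicity II", "length": 305.066},
    {"name": "Miss Gradenko", "length": 120},
]
listing = [(t["name"], format_length(t["length"])) for t in tracks]
```

Under that assumption, "Synchronicity II" is listed as 5:05 and "Miss Gradenko" as 2:00.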

3.2.8. Expanded Values and Default Properties

In this tutorial we've said that we query the value of a property p with "p":null and "expand" that value into an object with "p":{}. This is helpful terminology, but it is actually the opposite of what is really going on. Everything in Metaweb is an object (or, in the case of literal values, can be viewed as an object). When you use curly braces, objects are naturally expressed as objects. When you use null, however, objects are compressed: instead of returning the complete object, Metaweb returns only one property – called the default property – of the object. The query "p":null really means "look up the expected type of the property p, then look up the default property of that type and return the value of that default property of the object that p refers to".

If the expected type of a property is a primitive type, then the default property is value. If the expected type is not a system type (i.e. if it is not in the /type domain) then the default property is name. If the expected type is a system type, then the default property depends on the type but is usually id instead of name.

Default properties are not only used when you ask Metaweb to fill in a null or a [] for you. They are also used when you express the information you already have. Consider the following query:

{
  "type" : "/music/album",
  "name" : "Synchronicity",
  "artist" : "The Police",
  "track" : []
}

This query could also be expressed more verbosely like this:

{
  "type" : "/music/album",
  "name" : {"value":"Synchronicity", "lang":"/lang/en"},
  "artist" : {"name":"The Police"},
  "track" : []
}

The verbose form of the query illustrates the fact that the succinct form relies on default properties. The name property is expected to be of /type/text, whose default property is value. The artist property is expected to be of type /music/artist, whose default property is name.
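The default-property rules can be summarized in a few lines of Python. This is only a sketch: the function name default_property is hypothetical, the set of primitive types is a small illustrative sample rather than the full list from Chapter 2, and the last branch returns id even though the text says only that system types usually default to id.

```python
# Assumption: an illustrative, incomplete sample of primitive (value) types.
PRIMITIVE_TYPES = {"/type/text", "/type/float", "/type/key"}

def default_property(expected_type):
    """Guess the default property for a property's expected type."""
    if expected_type in PRIMITIVE_TYPES:
        return "value"
    if not expected_type.startswith("/type/"):
        # Not a system type: the default property is name.
        return "name"
    # A system type: the default varies by type but is usually id.
    return "id"
```

For the album query above, default_property("/type/text") gives "value" for the name property and default_property("/music/artist") gives "name" for the artist property.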

3.2.9. Review: Asking for Values

If you want to ask Metaweb to return a value, use one of the terms listed in Table 3.1 on the right-hand side of a property name:

Table 3.1. Asking for Values

Term Meaning
null

If the property is of value type, return the value property. Otherwise, the property refers to an object. If the object is of a system type in the /type domain, return the value of its default property (this is usually the id property). Otherwise, return the name of the object.

[]

Like null, but return an array of values instead of a single value.

{}

If the property is of value type, return an object that represents the value. This object will have type and value properties. If the property is /type/text, the returned object will also have a lang property, and if it is of /type/key, it will have a namespace property.

If the property is of object type, return an object that includes its name, id, and type properties. In this case, the term {} is equivalent to: {"name":null,"id":null,"type":[]}.

[{}]

Like {}, but return an array of objects instead of a single one.


3.3. Names, Ids, Keys and Namespaces

Metaweb defines a few different ways to name objects. This section begins with a review of names, ids, guids, keys and namespaces, repeating some of the material from Section 2.3. After this overview, it demonstrates, with queries, many of the important features of names, ids, and namespaces.

Every Metaweb object has a name property which can be used to specify or query a human-readable name (such as "The Police") for the object. Names are of /type/text and have a language (of /type/lang) associated with them. An object may have more than one name, but is only allowed to have one name per language. The name property of an object usually behaves, therefore, as if it has a single value. Names do not uniquely identify single objects: multiple objects may (and often do) share the same name.

In addition to its human-readable name or names, every Metaweb object has zero or more fully-qualified names or identifiers. These are hierarchical names that use the / character in the way filesystems do. They are intended for use by Metaweb developers and are not typically displayed to end-users of Metaweb-based applications. Fully-qualified names are of /type/id, and are values of the id property of an object. /music/album is a fully-qualified name, and so is /en/the_police. A fully-qualified name uniquely identifies a single object, but is not immutable: fully-qualified names can be deleted or re-assigned to other objects. Use the id property to query or specify the fully-qualified name of an object.

An object may have more than one fully-qualified name. In a query, you can specify an object by giving any of its fully-qualified names as the value of the id property. If you query the id of an object that has more than one fully-qualified name, the name that is returned is arbitrary.

Every object in a Metaweb database has exactly one globally unique identifier or guid. Guids are machine-readable hexadecimal numbers 32 digits (128 bits) long. They serve as universally-unique identifiers – no two objects (even objects retrieved from different Metaweb databases) will ever have the same guid. If you query the id property of an object that does not have any fully-qualified names, Metaweb will return a pseudo-id based on the object's guid. For example:

/guid/9202a8c04000641f800000000006df1b

The fully-qualified names of an object are defined by its keys. These are the /type/key values of the key property. Each key consists of an unqualified name (the value of the key) and a namespace (the namespace of the key). The namespace is another Metaweb object, and it is usually referred to by its fully-qualified name. The fully-qualified name /en/the_police, for example, has a value (or unqualified name) of "the_police" and a namespace whose id is /en. That namespace object has a key whose value is "en" and whose namespace is the special root namespace object whose id is /.
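The relationship between a fully-qualified name, its namespace, and its key value can be sketched in a few lines of Python. The helper name split_id is hypothetical, and the sketch assumes that the key value itself contains no / character.

```python
def split_id(fq_name):
    """Split a fully-qualified name into (namespace id, key value)."""
    namespace, _, key = fq_name.rpartition("/")
    # A name directly under the root has an empty prefix; the root's id is "/".
    return (namespace or "/", key)
```

Applied to /en/the_police, this yields the namespace /en and the key value "the_police"; applied in turn to /en, it yields the root namespace / and the key value "en".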

A namespace may not contain two keys with the same name – this is another way of saying that a fully-qualified name must uniquely refer to a single object. A namespace may normally define two distinct names that refer to the same object. It is possible, however, to define "unique namespaces" in which there is a one-to-one mapping between names and objects: each name refers to a single object, and each object has only a single name (in that namespace).

Working with keys can be confusing, and there is a shortcut that is sometimes useful. Some types define properties that are known as enumerations. The definition of an enumeration specifies a namespace, and the value of the property serves as a name within that namespace. Setting the value of an enumerated property of an object defines a fully-qualified name for the object.

We'll see examples of names, ids, keys and enumerations in the sub-sections that follow.

3.3.1. Names and Per-Language Uniqueness

The name property of any Metaweb object is special because it behaves like a unique property (you can safely query it with null instead of [], for example) even though it is not truly unique. Any Metaweb object can have multiple names, but may have only one name in any given language. That is, the name property is unique on a per-language basis. When you query the name of an object, Metaweb returns its name (if it has one) in your preferred language. (The desired language is specified as a parameter to the mqlread query service: see Section 4.2.4.3.)

To demonstrate the special behavior of the name property, we must choose a topic that has translations into other languages. Let's ask about the name of the object with id "/en/united_states":

Query Result
{
  "id":"/en/united_states",
  "name":null
}
{
  "id":"/en/united_states",
  "name":"United States"
}

The "en" in the namespace of the id stands for English, so it is not surprising that the English name of this object matches the id. Now let's ask for more details about the name:

Query Result
{
  "id":"/en/united_states",
  "name":{}
}
{
  "id":"/en/united_states",
  "name" : {
    "type" : "/type/text",
    "value" : "United States",
    "lang" : "/lang/en"
  }
}

This confirms that the name "United States" is an English name. Now let's ask for all names of the object:

Query Result
{
  "id":"/en/united_states",
  "name":[]
}
{
  "id":"/en/united_states",
  "name":["United States"]
}

This query just returns the unique English name in an array. So let's try again and ask for all names, along with the languages in which they are encoded:

Query Result
{
  "id":"/en/united_states",
  "name":[{}]
}
{
  "id":"/en/united_states",
  "name" : [
    {"lang":"/lang/en","type":"/type/text",
     "value":"United States"},
    {"lang":"/lang/es","type":"/type/text",
     "value":"Estados Unidos de América"},
    {"lang":"/lang/fr","type":"/type/text",
     "value":"États-Unis d'Amérique"},
    {"lang":"/lang/it","type":"/type/text",
     "value":"Stati Uniti d'America"},
    {"lang":"/lang/de","type":"/type/text",
     "value":"Vereinigte Staaten"}
  ]
}

Bingo! We find that this object has a number of names (only some of which are listed here).

Here's how we can ask for a name of the object in a specific language other than our preferred language:

Query Result
{
  "id":"/en/united_states",
  "name":{
    "value":null,
    "lang":"/lang/fr"
  }
}
{
  "id":"/en/united_states",
  "name": {
    "value": "États-Unis d'Amérique",
    "lang": "/lang/fr"
  }
}

The default preferred language (and the one used throughout this chapter) is English. We'll learn how to specify a different language in Section 4.2.4.3.
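Client code can pick a particular language out of the result of a "name":[{}] query. The following Python sketch assumes that the result has already been parsed into a list of dictionaries; the helper name name_in_language is hypothetical and is not part of any Metaweb library.

```python
def name_in_language(names, lang="/lang/en"):
    """Return the name value for one language, or None if there is none.

    `names` is assumed to be the parsed value of a "name":[{}] query:
    a list of dicts with "lang", "type" and "value" keys.
    """
    for entry in names:
        if entry.get("lang") == lang:
            # Names are unique per language, so the first match is the only one.
            return entry["value"]
    return None


# Sample data copied from the query results shown in this section.
names = [
    {"lang": "/lang/en", "type": "/type/text", "value": "United States"},
    {"lang": "/lang/fr", "type": "/type/text", "value": "États-Unis d'Amérique"},
]
print(name_in_language(names, "/lang/fr"))
```

When only one language is needed, the query that constrains "lang" directly (shown above) is simpler; a helper like this is mainly useful when an application displays several languages from a single response.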

3.3.2. Ids

The most important thing about the id property is that if you specify its value, you are uniquely identifying a single object in the database. A query that specifies id need not have square brackets around it: it can never return more than one value. It is also a good rule of thumb (though not always strictly necessary) to put square brackets around any query that does not specify the id property.

It is always safe to query the id of an object with "id":null. This query will always return exactly one value (even if the object has more than one fully-qualified name). If the object does not have any fully-qualified names, Metaweb returns a pseudo-id using the object's guid and the /guid namespace. For example:

Query Result
{
  "type" : "/music/album",
  "artist": "The Police",
  "name" : "Synchronicity",
  "id" : null
}
{
  "type" : "/music/album",
  "artist" : "The Police",
  "name" : "Synchronicity",
  "id" : "/guid/9202a8c04000641f8000000002f9e349"
}

There is a one-to-one mapping between objects and guids: every object has one guid, and every valid guid must refer to a different object. Ids are less strict: no two objects can have the same id, but an object can have zero, one, or more fully-qualified names. We can refer to the Metaweb object that represents the band The Police with any of the following:

"id":"/guid/9202a8c04000641f800000000006df1b"
"id":"/en/the_police"
"id":"/wikipedia/en_id/57321"
"id":"/wikipedia/en/Police_band"

There are some restrictions on what you can do with the id property: it cannot be used as a sort key, and it cannot be used with operators such as ~= and <. (We'll learn about sorting and operators later in this chapter.)

Also, you cannot query all the fully-qualified names of an object with "id":[]:

Query Result
{
  "type" : "/music/artist",
  "name" : "The Police",
  "id" : []
}
{
  "type": "/music/artist",
  "name": "The Police",
  "id": ["/en/the_police"]
}

"id":[] returns a single valid id for the object, just as "id":null does, but it wraps it in square brackets. To find multiple fully-qualified names of an object, you must query its keys, which are the topic of the next section.

3.3.3. Keys and Namespaces

To ask for multiple fully-qualified names for an object, query its key property. The following query asks for all properties of all keys of The Police.

Query Result
{
  "id":"/en/the_police",
  "key":[{}]
}
{
  "id" : "/en/the_police",
  "key" : [{
    "type" : "/type/key",
    "namespace" : "/en",
    "value" : "the_police"
  },{
    "type" : "/type/key",
    "namespace" : "/wikipedia/en_id",
    "value" : "57321"
  },{
    "type" : "/type/key",
    "namespace" : "/wikipedia/en",
    "value" : "Police_band"
  },{
    "type" : "/type/key",
    "namespace" : "/wikipedia/en",
    "value" : "The_Police_$0028band$0029"
  }]
}

The results of this query have been truncated, but there are two things worth noting about the representative keys shown here. First, fully-qualified names can contain letters, numbers, and underscores, but they cannot contain other punctuation. If a local name contains punctuation such as parentheses, these must be escaped using Unicode codepoints. For example, $0028 in a fully-qualified name represents a left parenthesis and $0029 represents a right parenthesis. (See Section 2.5.9 for the full list of legal characters in fully-qualified names.)

Second, note that these keys do not include a key in the /guid namespace. /guid ids are synthesized by Metaweb when no fully-qualified name exists: they do not represent a real key.

The query above returns keys representing multiple fully-qualified names for The Police. Those results are not necessarily a complete list, however. The keys refer to namespaces, and the namespaces themselves can have more than one fully-qualified name. Consider the /en namespace:

Query Result
{
  "id":"/en",
  "key":[{}]
}
{
  "id" : "/en",
  "key" : [{
    "type" : "/type/key",
    "namespace" : "/topic",
    "value" : "en"
  },{
    "type" : "/type/key",
    "namespace" : "/",
    "value" : "en"
  }]
}

This query tells us that our namespace object has the name "en" in the root namespace /, but that it also has the name "en" in the namespace /topic. This means that the fully-qualified name /en/the_police can also be written as /topic/en/the_police.
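The relationship between keys and fully-qualified names can be sketched in a few lines of Python. This is only an illustration: it assumes the value of a "key":[{}] query has been parsed into a list of dictionaries, and the function name fully_qualified_names is hypothetical.

```python
def fully_qualified_names(keys):
    """Build one fully-qualified name per key from a parsed "key":[{}] result."""
    names = []
    for key in keys:
        namespace = key["namespace"]
        # The root namespace is written "/", so avoid producing "//en".
        prefix = "" if namespace == "/" else namespace
        names.append(prefix + "/" + key["value"])
    return names


# Sample data copied from the two query results shown above.
police_keys = [
    {"type": "/type/key", "namespace": "/en", "value": "the_police"},
    {"type": "/type/key", "namespace": "/wikipedia/en_id", "value": "57321"},
]
en_keys = [
    {"type": "/type/key", "namespace": "/topic", "value": "en"},
    {"type": "/type/key", "namespace": "/", "value": "en"},
]
print(fully_qualified_names(police_keys))
print(fully_qualified_names(en_keys))
```

The function yields one name per key, using whichever id Metaweb returned for the namespace. Because a namespace can itself have several names, as /en does, this list is not every possible spelling of the object's name.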

The keys of an object all refer to namespace objects in which those keys are defined. Since Metaweb links are bi-directional, it must also be possible to query a namespace to find out what names are defined in it. Here's a query on the /topic namespace. It queries the key property we've already seen to ask for the fully-qualified names of this object. But it also queries the keys property to find out what keys are defined in /topic. (The key property is defined by /type/object. The keys property is defined by /type/namespace, and is plural to avoid naming conflicts with key.):

Query Result
{
  "type":"/type/namespace",
  "id":"/topic",
  "key":[{}],
  "keys":[{}]
}
{
  "type" : "/type/namespace",
  "id" : "/topic",
  "key" : [{
    "namespace" : "/",
    "type" : "/type/key",
    "value" : "topic"
    }],
  "keys" : [{
    "namespace" : "/en",
    "type" : "/type/key",
    "value" : "en"
  }]
}

The results tell us that the object /topic has only the one fully-qualified name we already know: the local name "topic" in the namespace /. They also tell us that there is only a single key defined in the /topic namespace. This key has the local name "en", which means that it defines the fully-qualified name /topic/en.

Both the key and keys properties return values of /type/key. The namespace properties of these keys have different meanings, however. When you query the key property of an object, the namespace property of each returned key refers to the namespace object in which the key is defined. When you query the keys property of a namespace, however, the namespace property of each returned key specifies the object whose name is defined by that key. So the query above tells us that the /topic namespace defines a key named "en" that refers to an object with id /en. That is, /topic/en is another fully-qualified name for /en.

Finally, it is worth noting that any Metaweb object can serve as a namespace, even if it does not have a type property of /type/namespace. Types (such as /music/artist) have ids that use the domain (/music) as a namespace, and properties (such as /music/artist/album) have ids that use the type as a namespace. At the time of this writing, [12] Metaweb domains and types very often do not have /type/namespace among their types. Despite these examples, it is not recommended practice to use an object as a namespace without typing it as a namespace.

3.3.4. Namespaces and Uniqueness

A namespace can define any number of keys, but the value property of each key must be different. Otherwise, the same fully-qualified name would be defined twice, and could refer to more than one object. On the other hand, the reverse is not necessarily true: normal namespaces can define multiple keys that refer to the same object. For example, we've already seen that the /wikipedia/en namespace defines "Police_band" and "The_Police_$0028band$0029" as names for the same object.

Not all namespaces allow multiple names for the same object, however. A namespace may be declared to be "unique", and unique namespaces enforce a one-to-one mapping between names and objects: each name can refer to only one object and each object can have only one name. The /user and /lang namespaces are unique, which means that each user and language can have only a single fully-qualified name (assuming that /user and /lang don't have any other ids themselves). Most other namespaces, such as /en, are not unique. Unique namespaces have the unique property of /type/namespace set to true. Namespaces that are not unique typically have null for this property.

Query Result
{
  "id":"/lang",
  "type":"/type/namespace",
  "unique":null
}
{
  "id" : "/lang",
  "type" : "/type/namespace",
  "unique" : true
}

3.3.5. Enumerations

An enumeration is MQL's mechanism for connecting a type with the namespace that defines names for objects of that type. An enumeration takes the form of a property whose expected type is /type/enumeration. If a type has a property like this, then setting the value of that property on an object defines a key in the associated namespace. The property value is the name of the key, and the key refers to the object on which the property was set. Similarly, if you create a name in the namespace that refers to an object, then that object automatically gets a value for the property.

The type /type/lang is associated with the namespace /lang through the enumerated property iso639. (ISO639 is the name of an international standard defining short codes such as "en" and "fr" for language names.) This means that for any object of /type/lang, the value of the iso639 property is also used to define a fully-qualified name in the /lang namespace. The following query (and partial set of results) demonstrates:

Query Result
[{
  "type":"/type/lang",
  "name":null,
  "id":null,
  "iso639":null
}]
[{
  "type" : "/type/lang",
  "name":"English",
  "id" : "/lang/en",
  "iso639" : "en"
},{
  "type" : "/type/lang",
  "name":"German",
  "id" : "/lang/de",
  "iso639" : "de"
},{
  "type" : "/type/lang",
  "name":"Spanish",
  "id" : "/lang/es",
  "iso639" : "es"
},{
  "type" : "/type/lang",
  "name":"French",
  "id" : "/lang/fr",
  "iso639" : "fr"
}]

/type/lang defines the iso639 property so that its value becomes part of the fully-qualified name of the object. In a sense, then, this iso639 property is the "unqualified name" or "local id" of every /type/lang object. So language objects have a human-readable name, a fully-qualified, hierarchical id, and this local name, the values of which are defined (or "enumerated") by international standard ISO639.

Let's investigate the iso639 property itself:

Query Result
{
  "type":"/type/property",
  "id":"/type/lang/iso639",
  "expected_type":null,
  "enumeration":null,
  "unique":null
}
{
  "type" : "/type/property",
  "id" : "/type/lang/iso639",
  "expected_type" : "/type/enumeration",
  "enumeration" : "/lang",
  "unique" : true
}

We see from this query that the property has expected type /type/enumeration, and that it has a property named enumeration that refers to the associated namespace. Since we already know that the /lang namespace is unique (no language can have more than one name in the namespace), it follows that this iso639 property must also be defined as a unique property: it cannot make sense to have more than one value for the property.

The userid property of /type/user is an important enumeration that links /type/user objects with the /user namespace. Another notable example involves taxonomy. The type /biology/organism_classification has two enumerated properties: the itis_tsn property links to the /biology/itis namespace and the ncbi_taxon_id links to the /biology/ncbi namespace. For example, the Metaweb object representing the species Equus caballus (horses) has its itis_tsn property set to "180691" and its ncbi_taxon_id set to "9796" and can be referred to as: /biology/itis/180691 or /biology/ncbi/9796. For this type, neither the enumerated properties nor the namespaces they refer to are marked unique, which means that either the ITIS or NCBI classification scheme may allow multiple names for the same species.

It should be clear from these examples that MQL enumerations are not really the same thing as the "enumerated types" (a type that has a small, pre-defined set of allowed values) that are supported by some programming languages. The name "enumeration" refers to the fact that the namespace associated with the enumerated property enumerates instances of the type. As the examples above have shown, enumerations are most useful when there is an external authority (such as an international standard) that defines the names.

Recall that the names defined in namespaces are restricted to using letters, numbers, and underscores. Any other character must be written as a dollar sign followed by the four hexadecimal digits of its Unicode codepoint. MQL does not impose this restriction on enumerated properties. Instead, it transparently escapes and un-escapes names as needed. That is, if you set the value of an enumerated property to a string that contains a punctuation character, that character will automatically be escaped in the namespace. Similarly, if a key contains escaped punctuation but you read it through an enumerated property, the escapes will be replaced by the characters they represent. This can't be demonstrated using the /lang, /user or /biology namespaces we've mentioned here, since none of them define names that include punctuation. But we'll see an example in Chapter 5 when we define our own enumerated properties for types of our own.
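The escaping rule can be sketched in Python. This is a minimal illustration, not Metaweb's implementation: the function names escape_key and unescape_key are hypothetical, the set of unescaped characters is assumed to be exactly ASCII letters, digits, and underscore (Section 2.5.9 gives the authoritative list), the hexadecimal digits are assumed to be upper-case, and characters outside the Basic Multilingual Plane are not handled.

```python
import re


def escape_key(name):
    """Escape every character other than ASCII letters, digits and underscore.

    Each escaped character becomes "$" plus four hex digits of its codepoint.
    Assumption: codepoints above U+FFFF do not occur in the input.
    """
    def repl(match):
        return "$%04X" % ord(match.group(0))
    return re.sub(r"[^A-Za-z0-9_]", repl, name)


def unescape_key(key):
    """Replace each $XXXX escape with the character it represents."""
    return re.sub(
        r"\$([0-9A-Fa-f]{4})",
        lambda match: chr(int(match.group(1), 16)),
        key,
    )


print(escape_key("The_Police_(band)"))
print(unescape_key("The_Police_$0028band$0029"))
```

The first call reproduces the key "The_Police_$0028band$0029" seen earlier in the /wikipedia/en namespace, and the second call reverses it.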

3.3.6. GUIDs

If you query the id of an object, Metaweb returns one of its fully-qualified names, if it has any. If it has none, Metaweb returns a synthetic identifier based on the guid of the object. Fully-qualified names and guids both uniquely identify a single object, but there is an important difference between them. Fully-qualified names are defined by keys that are separate from the object itself, and these keys are mutable. A fully-qualified name may be deleted and refer to no object at all. Or it may be modified so that it refers to a new object. If you run a query that identifies an object by its fully-qualified name, that query may (though this is unlikely) refer to a different object each time you run it, or it may fail because the fully-qualified name now refers to no object at all.

Guids, on the other hand, are intrinsic to the object and are persistent and immutable: a guid is assigned when an object is created, and it always refers to that object. If you want to be sure that a query always refers to exactly the same object, use the object's guid. If an object has a fully-qualified name, you cannot obtain its guid by querying the id property. In this case, query the guid property:

Query Result
{
  "type" : "/music/artist",
  "name": "The Police",
  "guid" : null
}
{
  "type" : "/music/artist",
  "name" : "The Police",
  "guid" : "#9202a8c04000641f800000000006df1b"
}

For historical reasons, the value returned by the guid property has a leading # character. You can use this value, with its leading hash, as the value of the guid property, or you can replace the # with /guid/ and use the modified string as the value of the id property. In either case, you'll always refer to the original object, even if the fully-qualified names that used to refer to that object now refer to other objects.
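The conversion between the two forms is a simple string substitution. Here is a minimal Python sketch; the function names guid_to_id and id_to_guid are hypothetical helpers, not part of any Metaweb library.

```python
def guid_to_id(guid):
    """Convert a guid property value ("#...") to a /guid/ pseudo-id."""
    if not guid.startswith("#"):
        raise ValueError("expected a guid value with a leading '#'")
    return "/guid/" + guid[1:]


def id_to_guid(object_id):
    """Convert a /guid/ pseudo-id back to a guid property value."""
    prefix = "/guid/"
    if not object_id.startswith(prefix):
        raise ValueError("expected an id in the /guid namespace")
    return "#" + object_id[len(prefix):]


print(guid_to_id("#9202a8c04000641f800000000006df1b"))
```

Either string identifies the same object: use the first form as the value of the guid property and the second as the value of the id property.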

Applications that use guids tend to be more brittle and less resilient to database changes than applications that use fully-qualified names, so the use of the guid property is generally discouraged.

3.4. Property Names in MQL Queries

So far in this chapter, we've seen simple property names to the left of the colon in MQL queries. This section explores property names in more depth, explaining how to use qualified property names, property name prefixes, and property name wildcards. In addition to property names, this section also explores /type/property objects themselves. Later in this chapter, we'll learn about directives and operators. Directives are special MQL commands that appear to the left of the colon in place of a property name. Operators are punctuation characters that are appended to the end of a property name.

3.4.1. Simple and Qualified Property Names

Recall from the beginning of this tutorial that most objects in Metaweb have two or more types:

Query Result
{
  "id":"/en/the_police",
  "name":null,
  "type":[]
}
{
  "id" : "/en/the_police",
  "name" : "The Police",
  "type" : [
    "/music/artist",
    "/common/topic",
    "/music/producer",
    "/music/musical_group"
  ]
}

What do you do if you want to query one property, such as a list of albums from one type, and another property, such as a list of images, from a second type? MQL addresses this issue by allowing you to specify a fully-qualified property name that includes the name of the type to which it belongs. So here is how we ask for the albums, tracks and pictures by and of The Police:

1 {
2   "type":"/music/artist",
3   "name":"The Police",
4   "album":[{
5      "name":null,
6      "track":[]
7   }],
8   "/common/topic/image":[{}]
9 }

Line 2 specifies that the object to be matched should be of type /music/artist. Line 3 specifies the name of the object. name and type are properties of /type/object, and are shared by all objects in the database. These property names (along with id, key, etc.) can always be used without qualification (although you can qualify them with /type/object if you want to). Other types are not allowed to define properties whose names conflict with these.

Line 4 asks for a property named album. This property is not defined by /type/object, but it is defined by /music/artist, and the query has already declared that the object will be an instance of that type, so MQL allows us to use this unqualified property name. Line 5 is like lines 2 and 3: it names a property shared by all objects. Line 6 is an interesting case. track is a property of /music/album. We can use it here without qualification because the album property to which this sub-query is attached was declared to have an "expected type" of /music/album. MQL knows this and assumes that the unqualified property name track means /music/album/track.

Finally, line 8 asks for a property named image. This is not defined by /type/object nor by /music/artist, and so we must qualify it with the name of its type so that Metaweb can understand it.

For symmetry, and to be explicit, you can rewrite the query (dropping the track portion) to fully-qualify both properties of interest:

{
  "type":"/music/artist",
  "name":"The Police",
  "/music/artist/album":[],
  "/common/topic/image":[{}]
}

If you do this, you might be tempted to drop the initial type specification, since the album property is now fully-qualified:

[{
  "name":"The Police",
  "/music/artist/album":[],
  "/common/topic/image":[{}]
}]

This is probably not the query you want, however: it returns any object whose name is The Police, even if it has no album or image properties, and even if it is an instance of neither /music/artist nor /common/topic.

In addition to querying properties from two different types, there is another reason you might need to use fully-qualified names: you might want to query the value of a property without constraining the results to members of the type that defines the property. It may seem surprising, but the Metaweb architecture allows objects to have values for any property, even if the object does not declare itself to be a member of that type. Metaweb type objects are an example: they serve as a namespace for the properties they define (the album property of /music/artist is /music/artist/album) but are not typically typed as namespaces (the set of types for /music/artist does not include /type/namespace). It is possible to query the namespace keys of an object, even if that object is not typed as a namespace, with a query like this:

{
  "id":"/music/album",
  "/type/namespace/keys":[{}]
}

Without fully-qualified property names, we'd have to write this:

{
  "id":"/music/album",
  "type":"/type/namespace",
  "keys":[{}]
}

But this query would simply return null because the object with an id of /music/album does not have /type/namespace in its set of types.

3.4.2. Property Prefixes

Suppose we want to find the names of all bands who have an album named "Greatest Hits" AND an album named "Super Hits". We might try this query:

[{
  "type":"/music/artist",
  "name":null,
  "album":["Greatest Hits","Super Hits"]  // Invalid MQL
}]

But this is not legal MQL. And if it were, it would probably mean "find an artist who has recorded exactly two albums, with names 'Greatest Hits' and 'Super Hits'". A musical artist object may have multiple album links to album objects. We want to constrain our query so that all result objects have links to two specific album names. Here's a natural way to express this query:

[{
  "type":"/music/artist",
  "name":null,
  "album":"Greatest Hits",
  "album":"Super Hits"      // Invalid JSON
}]

This query makes sense in the Metaweb object model: find objects that have one "album" link to an album named "Greatest Hits" and another "album" link to an album named "Super Hits". Unfortunately, this query is not valid JSON: it includes the same property name twice, which means that it cannot be parsed into object form. (To put this another way, you could not represent this query in a dictionary or hash data structure in a programming language like Python, Ruby or JavaScript.)
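The limitation is easy to see in Python. Some parsers reject a repeated property name and others silently keep only one value; Python's standard json module does the latter, so one of the two constraints is lost. The sketch below shows that behavior and then uses the module's object_pairs_hook parameter to detect the repetition instead. The helper name reject_duplicates is hypothetical.

```python
import json

text = '{"album": "Greatest Hits", "album": "Super Hits"}'

# A dict holds one value per key, so the later value replaces the earlier one.
parsed = json.loads(text)
print(parsed)


def reject_duplicates(pairs):
    """Build a dict from (name, value) pairs, refusing repeated names."""
    seen = {}
    for name, value in pairs:
        if name in seen:
            raise ValueError("duplicate property name: %r" % name)
        seen[name] = value
    return seen


try:
    json.loads(text, object_pairs_hook=reject_duplicates)
except ValueError as err:
    print(err)
```

Either way, a single dictionary cannot carry both album constraints under the same name.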

MQL's solution to this dilemma is to allow an arbitrary identifier and colon to prefix any property name. The prefix and colon are ignored: they serve simply as a workaround to the JSON limitation just described. With this trick we can rewrite the query above like this:

Query Result
[{
  "type":"/music/artist",
  "name":null,
  "a:album":"Greatest Hits",
  "b:album":"Super Hits"
}]
[{
  "type": "/music/artist",
  "name": "Alice Cooper",
  "a:album": "Greatest Hits",
  "b:album": "Super Hits"
},{
  "type": "/music/artist",
  "name": "Dan Fogelberg",
  "a:album": "Greatest Hits",
  "b:album": "Super Hits"
}]

Note that the arbitrary prefixes we choose for the query are repeated in the result objects. (The results shown here are truncated, of course.) The prefixes are arbitrary, but they must be valid identifiers, which means they cannot contain punctuation characters and must not begin with a digit.

Another use of property prefixes is to constrain a property and also query the property at the same time. Here's how we can identify an object by name and type, and also ask for its full set of types:

Query Result
{
  "constraint:type":"/music/artist",
  "name":"The Police",
  "query:type":[]
}
{
  "constraint:type" : "/music/artist",
  "name" : "The Police",
  "query:type" : [
    "/music/artist",
    "/common/topic",
    "/music/producer",
    "/music/musical_group"
  ]
}

Note that although property prefixes are arbitrary, we can choose identifiers like "constraint" and "query" that add meaning to our queries. In practice, it is common to see "a", "b", and "c" used as prefixes.
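When queries are built by a program, prefixes can be generated rather than chosen by hand. The following Python sketch is one way to do this under stated assumptions: the helper names require_all and strip_prefix are hypothetical, the generated prefixes v0, v1, and so on are arbitrary, and strip_prefix assumes that the property name itself contains no colon.

```python
import json


def require_all(prop, values):
    """Return one prefixed constraint on `prop` for each entry in `values`."""
    # Each prefix starts with a letter and is distinct, so the keys never collide.
    return {"v%d:%s" % (i, prop): value for i, value in enumerate(values)}


def strip_prefix(key):
    """Return the property name without its prefix, if it has one."""
    prefix, colon, rest = key.partition(":")
    return rest if colon else key


query = [{"type": "/music/artist", "name": None}]
query[0].update(require_all("album", ["Greatest Hits", "Super Hits"]))
print(json.dumps(query, indent=2))
```

Because the prefixes are repeated in the result objects, a function like strip_prefix can be applied to the keys of each result to recover the underlying property names.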

Here is a more complex example that uses multiple prefixes to constrain and query properties at the same time. It asks for albums by solo artists (objects that are both /music/artist and /people/person) who have released Greatest Hits and Super Hits albums:

[{
  "type":"/music/artist",
  "also:type":"/people/person",
  "name":null,
  "album":[],
  "includes:album":"Greatest Hits",
  "and:album":"Super Hits"
}]

Suppose that for symmetry we wanted to use a prefix before both of the type constraints in the query above, so that they looked like this:

  "primary:type":"/music/artist",
  "secondary:type":"/people/person"

If we do this, the query fails with the message "Type /type/object does not have property album". If we put a prefix in front of the type property, MQL will not automatically search the specified type for properties. To make the query work with prefixed type properties, we must fully-qualify the album properties like this:

  "/music/artist/album":[],
  "includes:/music/artist/album":"Greatest Hits",
  "and:/music/artist/album":"Super Hits"

As an interesting aside, let's return to the query with which we started this section. We want to find bands that have released "Greatest Hits" and "Super Hits" albums. There is actually a way to do this without property prefixes. It relies on the fact that Metaweb relationships are always bi-directional and that MQL queries can be "turned inside out":

[{
  "type":"/music/artist",
  "name":null,
  "album":[{
    "name":"Greatest Hits",
    "artist":{
      "album":"Super Hits"
    }
  }]
}]

Translated into English, this query says: "give me the names of all bands that have released an album named 'Greatest Hits', the artist of which has released an album named 'Super Hits'". The album property of a band object refers to an album object. And the artist property of the album object refers back to the band object. We can use this fact to further constrain the artist. This technique is worth understanding because it illustrates one of the deep properties of Metaweb objects and MQL queries.

3.4.3. Wildcards

MQL allows you to use the property name "*" as a wildcard. Consider the following query:

Query Result
{
  "id":"/guid/9202a8c04000641f8000000002f9e349",
  "*":null
}
{
  "id":"/guid/9202a8c04000641f8000000002f9e349",
  "guid" : "#9202a8c04000641f8000000002f9e349",
  "name" : "Synchronicity",
  "type" : ["/music/album", "/common/topic"],
  "key" : ["RELEASE3178", "196871"],
  "creator" : "/user/mwcl_musicbrainz",
  "permission" : "/boot/all_permission",
  "timestamp" : "2006-12-10T12:23:59.0119Z"
}

This query identifies a unique object by guid, and then uses a wildcard to ask for all of its properties. Since no type has been specified, the wildcard is expanded with all the properties of /type/object, and the result is as shown above.

Note that some of the properties expand to a single value, and others to arrays. Thus the syntax "*":null really means "*":null-or-[]. We could instead write the query using "*":[]. In this case, all of the properties are returned as arrays, even unique properties.

Now let's modify the query to specify a type other than the default of /type/object:

{
  "type":"/music/album",
  "name":"Synchronicity",
  "artist":"The Police",
  "*":null
}

In this query, the * wildcard expands differently. Since we have specified that the object is of type /music/album, Metaweb looks up the properties of that type and queries each one with a null or [], depending on whether the property is unique or not. It does this in addition to querying the common object properties shown in the query result above.

Note that if a property is explicitly listed in a query, a wildcard expansion will not overwrite it. Consider this:

{
  "type":"/music/album",
  "name":"Synchronicity",
  "artist":"The Police",
  "track":[{}],
  "*":null
}

This query explicitly asks for an array of tracks, as objects rather than just as track names. The expansion of the wildcard would normally include "track":[], but in this case that property would conflict with the explicitly specified one and will be left out of the expansion.
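The expansion rule just described can be modeled in a few lines of Python. This is a simplified illustration of the behavior, not Metaweb's implementation: the function name expand_wildcard is hypothetical, it handles only the "*":null form, and the schema dictionary is hand-written sample data (a handful of properties and whether each is unique) rather than anything fetched from Metaweb.

```python
def expand_wildcard(query, schema):
    """Model the expansion of "*": null within a single query object.

    `schema` maps property names to True (unique) or False (not unique).
    """
    expanded = {name: value for name, value in query.items() if name != "*"}
    if "*" in query:
        for prop, unique in schema.items():
            # Explicitly listed properties win over the wildcard expansion.
            if prop not in expanded:
                expanded[prop] = None if unique else []
    return expanded


# Hypothetical, partial schema for illustration only.
schema = {"guid": True, "name": True, "type": False, "key": False, "track": False}

query = {
    "type": "/music/album",
    "name": "Synchronicity",
    "artist": "The Police",
    "track": [{}],
    "*": None,
}
print(expand_wildcard(query, schema))
```

In the expanded query, guid is queried with null and key with [], while type, name, and track keep the values that were written explicitly.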

Metaweb will expand wildcards in sub-queries based on the type inferred from the expected type of the property. In this query, the * expands to /type/object and /music/track properties:

{
  "type":"/music/album",
  "name":"Synchronicity",
  "artist":"The Police",
  "track":[{"*":null}]
}

Now consider this query:

[{
  "type":"/music/album",
  "name":"Synchronicity",
  "artist":{
    "type":"/common/topic",
    "*":null
  }
}]

Here the artist sub-query is expected to match an object of type /music/artist. But we've explicitly specified the type /common/topic, so in this case the wildcard expands to the properties of /type/object and /common/topic. The properties of /music/artist are not included in the expansion.

Instead of writing a wildcard query as "*":null, you can also use another, more aggressive, form. "*":{} expands to query each property with {} or [{}] instead of null or []. Similarly, "*":[{}] expands to query each property, even unique properties, with [{}]. Contrast the following queries, each of which returns more information than the one before it:

// List the names of albums by The Police
{
  "id":"/en/the_police",
  "/music/artist/album":[]
}

// List the names, types and ids of albums by The Police
{
  "id":"/en/the_police",
  "/music/artist/album":[{}]
}

// List all object and album properties of all albums by The Police
{
  "id":"/en/the_police",
  "/music/artist/album":[{"*":null}]
}

// Expand all object and album properties of all albums by The Police
{
  "id":"/en/the_police",
  "/music/artist/album":[{"*":{}}]
}

A property name wildcard must be followed by null, [], {} or [{}]. This means that wildcard queries cannot be nested. We cannot take the last query above another step and write:

{
  "id":"/en/the_police",
  "/music/artist/album":[{"*":{"*":null}}]  // Illegal
}

The * property name wildcard queries the value of all properties based on the specified or inferred type of an object, and it queries those properties whether or not they have ever had a value assigned to them. MQL has another wildcard-like feature that allows us to query the value of all the properties of an object that have been defined, regardless of the type that defines those properties. This "reflective" capability is described in Section 3.7.3.

3.4.4. Inverting a Property with !

If you put an exclamation point before the name of a property, you are asking MQL to run the query using the reciprocal of the property you have named. To put this another way, a MQL property specifies a link between two objects, and the ! prefix asks MQL to follow that link backwards. The following two queries, for example, are equivalent:

// Return albums by the police
{
  "id" : "/en/the_police",
  "/music/artist/album" : []
}

// Return albums for which the artist property refers to this object
{
  "id" : "/en/the_police",
  "!/music/album/artist" : []
}

All Metaweb links are bi-directional, but not all properties have a reciprocal defined. The ! prefix is useful in those cases where you want to use a reverse property that has not been defined. The type /people/person defines a nationality property, with an expected type of /location/country, for example. But /location/country does not (or did not when this was written) define a reciprocal citizen property. So to ask Freebase for a list of citizens of Monaco, we could write:

Query Result
{
  "type" : "/location/country",
  "name" : "Monaco",
  "!/people/person/nationality" : []
}
{
  "type" : "/location/country",
  "name" : "Monaco",
  "!/people/person/nationality" : [
    "Olivier Beretta",
    "Louis Chiron",
    "Sebastien Gattuso",
    "Armand Forcherio",
    "Torben Joneleit",
    "Sophiane Baghdad",
    "Manuel Vallaurio"
  ]
}

We could, of course, have written an equivalent query like this:

[{
  "type" : "/people/person",
  "name" : null,
  "nationality" : "Monaco"
}]

This query returns the same set of names, but the result is much more verbose, with type and nationality properties repeated unnecessarily for each person returned.

3.4.5. Property Objects

As explained in Section 2.6.1, the properties of a Metaweb type are represented by objects of /type/property. Let's use a wildcard to query all the properties of the album property of the /music/artist type:

Query:
{
  "id":"/music/artist/album",
  "type":"/type/property",
  "*":null
}

Result:
{
  "id" : "/music/artist/album",
  "type" : "/type/property",
  "name" : "Albums",
  "key" : [ "album" ],
  "expected_type" : "/music/album",
  "schema" : "/music/artist",
  "unique" : null,
  "master_property" : "/music/album/artist",
  "reverse_property" : null,
  "delegated" : null,
  "enumeration" : null,
  "links" : [],
  "unit" : null
}

The sub-sections that follow explain the results of this query in detail. (For simplicity, some /type/object properties that are not relevant here have been omitted from these results.)

3.4.5.1. Property Names and Keys

The property we queried above has the name "Albums" and the key album. This key is a name defined in the /music/artist namespace, giving the property the fully-qualified name /music/artist/album. It is important to understand that when we refer to a "property name" in discussions of MQL, we almost always mean its fully-qualified name, or the unqualified name defined by its key. The name "Albums" appears in the Freebase client when you explore the /music/artist type, but it is never used in MQL queries. When writing MQL, we use the key: album.

3.4.5.2. Property Type and Schema

Every /type/property object has two types associated with it. expected_type is the type that the property value is expected to have. (This is called an "expected" type because Metaweb does not enforce it. MQL makes the assumption that property values match their expected type, but in practice any property can refer to an object of any type.) The schema is the type of which the property is a part. So the /music/artist/album property we queried above has an expected_type of /music/album and a schema of /music/artist.

3.4.5.3. Unique Properties

The unique property of a property object specifies whether the property is unique. Unique properties have "unique":true:

Query:
{
  "id":"/music/track/length",
  "/type/property/unique":null
}

Result:
{
  "id" : "/music/track/length",
  "/type/property/unique" : true
}

Non-unique properties can have the value false, but are more likely to have this property unset and return "unique":null.

3.4.5.4. Reciprocal Properties

Links between objects in the Metaweb database are inherently bidirectional. Types that are designed to work together, like /music/artist and /music/album, often take advantage of this bidirectionality by declaring pairs of reciprocal properties. Any link between an artist and an album is visible through the /music/artist/album property and also through its reciprocal, /music/album/artist.

The reciprocity of properties is apparent through the master_property and reverse_property properties. When we queried /music/artist/album above, we learned that it has a master_property of /music/album/artist and a reverse_property of null. Let's query the reciprocal property now:

Query:
{
  "id":"/music/album/artist",
  "type":"/type/property",
  "master_property":null,
  "reverse_property":null
}

Result:
{
  "id":"/music/album/artist",
  "type":"/type/property",
  "master_property":null,
  "reverse_property":"/music/artist/album"
}

We can determine from these results that the artist property of /music/album is the "master" property and the album property of /music/artist is the "reverse" property. These names imply more of a hierarchy than is really necessary: both properties are real, and you can usually write MQL queries without knowing whether a property is "master" or "reverse". Another way of thinking about master versus reverse properties is to assign a directionality to links. We can say that the link between an artist and an album is directed from the album to the artist. That is, it is an outgoing link from the album object and an incoming link to the artist object.
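The connection between reciprocal properties and the ! prefix can be sketched in a few lines of client-side Python. This is only an illustration: the helper names are hypothetical and are not part of MQL or of any Metaweb library. It assumes a property object shaped like the /type/property results shown above, and it assumes that ! may be applied to either member of a reciprocal pair.

```python
def is_reverse(prop):
    """A reverse property is one that names a master_property."""
    return prop.get("master_property") is not None


def equivalent_inverted_key(prop):
    """Return the "!"-prefixed key that follows the same link as `prop`.

    `prop` is a dict shaped like the /type/property results above.
    Returns None when no reciprocal property has been declared.
    """
    # A reverse property names its master; a master names its reverse.
    reciprocal = prop.get("master_property") or prop.get("reverse_property")
    if reciprocal is None:
        return None
    return "!" + reciprocal


album_prop = {
    "id": "/music/artist/album",
    "master_property": "/music/album/artist",
    "reverse_property": None,
}
print(equivalent_inverted_key(album_prop))
```

For /music/artist/album this produces the key used in the second of the two equivalent queries in Section 3.4.4.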

3.4.5.5. Other Properties of Properties

The enumeration property of /type/property was explained in Section 3.3.5: it refers to the namespace within which the value of the property becomes a key:

Query:
{
  "id":"/type/lang/iso639",
  "/type/property/enumeration":null
}

Result:
{
  "id":"/type/lang/iso639",
  "/type/property/enumeration":"/lang"
}

When the expected_type of a property is a numeric value, such as /type/float or /type/int, the unit property of the property typically refers to a /type/unit object that provides an interpretation for the value. Consider the /music/track/length property:

Query:
{
  "id":"/music/track/length",
  "type":"/type/property",
  "expected_type":null,
  "unit":null
}

Result:
{
  "id" : "/music/track/length",
  "type":"/type/property",
  "expected_type":"/type/float",
  "unit":"/en/second"
}

The links property of a property is the set of /type/link links that the property represents. We'll learn more about links in Section 3.7.

3.5. MQL Directives

Directives are special MQL commands that specify additional details about a query or request additional processing of query results. Because MQL is based on JSON, MQL directives look just like ordinary properties and values. The names of MQL directives are reserved words in MQL, however, and Metaweb does not allow types to define properties with these names.

The sections that follow document the limit, return, sort, index and optional directives. MQL also supports a link directive, which is covered in Section 3.7.

3.5.1. Limiting Results

To reduce resource consumption and bandwidth usage, Metaweb never returns more than 100 matches for a query (or for a sub-query) unless you explicitly ask for more. The Freebase database contains thousands of bands, for example, but this query only returns 100 of them:

[{
  "type":"/music/artist",
  "name":null
}]

To change the number of desired results to a larger, or a smaller, number, use the limit directive. Here, for example, is a query that returns the names of up to 2000 bands:

[{
  "type":"/music/artist",
  "name":null,
  "limit":2000
}]

Although MQL allows you to request arbitrarily large numbers of results, Metaweb does not guarantee that you'll always get an answer. Complicated queries with a large number of results may time out before Metaweb can complete the result. A better solution for large queries is to retrieve the results in batches, using a cursor. Cursors are not part of MQL: instead they are part of the mqlread service for delivering MQL queries to a Metaweb database. They are documented in Section 4.2.4.1.

Specifying a limit of 1 tells Metaweb that you're only interested in one result, and allows you to omit square brackets from your query. The following query, for example, is guaranteed not to result in a uniqueness error, even if there are two bands that have the same name:

{
  "type": "/music/artist",
  "name": "The Police",
  "id": null,
  "limit": 1
}

Specifying a limit of 0 can be useful to prune the result tree of values you aren't really interested in. The following query, for example, asks "What are the names of three bands who have recorded the song Masters of War? I'm only interested in the band name, so don't include the name of the song in the results":

Query:
[{
  "type":"/music/artist",
  "name":null,
  "track":{
    "name":"Masters of War",
    "limit":0
  },
  "limit":3
}]

Result:
[{
  "type" : "/music/artist",
  "name" : "Kevn Kinney",
  "track" : null
},{
  "type" : "/music/artist",
  "name" : "Timesbold",
  "track" : null
},{
  "type" : "/music/artist",
  "name" : "Bob Dylan",
  "track" : null
}]

Notice that the result of the query does not include a limit property. MQL responses normally have the same structure as their query, but most directives are not considered part of this structure and are not included in responses.

Since the limit directive must appear within curly braces, limiting a query sometimes requires you to transform a simple query into a more complex one (with more complex results). Consider this query to list all albums by The Police:

{
  "type" : "/music/artist",
  "name" : "The Police",
  "album" : []
}

If we want to limit the result to five albums, we must rewrite the query as follows:

Query:
{
  "type" : "/music/artist",
  "name" : "The Police",
  "album" : [{"name":null, "limit":5}]
}

Result:
{
  "type" : "/music/artist",
  "name" : "The Police",
  "album" : [
    {"name" : "Outlandos d'Amour"},
    {"name" : "Reggatta de Blanc"},
    {"name" : "Zenyatta Mondatta"},
    {"name" : "Ghost in the Machine"},
    {"name" : "Synchronicity"}
  ]
}

Finally, note that a limit of n means "return the first n results" not "pick n results at random". Results of Metaweb queries are unordered (unless you use the sort directive, which will be introduced shortly) so the results returned are effectively arbitrary. They are repeatable, however: re-running the same query will typically return the same n results.

3.5.2. Counting Results

MQL supports three directives for counting or estimating the number of matches for a query. This section explains the return, count, and estimate-count directives.

Sometimes you don't care what the results of a query are: you just want to know how many results there are. To find out, use the return directive. Here's how we'd ask how many albums The Police have released:

Query:
{
  "type" : "/music/artist",
  "name" : "The Police",
  "album" : { "return":"count" }
}

Result:
{
  "type" : "/music/artist",
  "name" : "The Police",
  "album" : 22
}

The value of the return directive is "count". Notice that this directive goes inside curly braces in the query, but the count is returned as a plain integer value. If we were interested in the names of the albums, we'd obviously have to use square brackets in the query so that Metaweb could return an array of results to us. In this case, the square brackets are unnecessary. (Though if we'd used them, Metaweb would return [22].)

If return:count appears at the top-level of a query, then Metaweb returns just the count:

Query:
{
  "type" : "/music/album",
  "artist" : "The Police",
  "return":"count"
}

Result:
22

Notice that in this case, the presence of the return directive causes the result to have a completely different JSON structure than the query does.

If a top-level query finds no matches, it returns a count of 0. Let's ask how many albums named "Arrested" have been released by The Police:

Query:
{
  "type" : "/music/album",
  "artist" : "The Police",
  "name" : "Arrested",
  "return":"count"
}

Result:
0

Note, however, that we get a different result when we ask the question this way:

Query:
{
  "type" : "/music/artist",
  "name" : "The Police",
  "album" : {
    "name": "Arrested",
    "return":"count"
  }
}

Result:
null

The result of this query without the return:count directive is null: it does not match anything. So there is nothing to count, and the query returns null even with the directive. Adding an optional directive to the sub-query solves this problem. We'll learn about the optional directive in Section 3.5.5.

If your query is complex, or if there are many results, the request may time out before Metaweb can count the exact number of matches. If timeouts are a concern, use return:estimate-count instead of return:count. As its name implies, this version of the directive returns an estimate of the number of matches rather than an exact count. The following query, for example, asks for the approximate number of musical artists in the database:

Query:
{
  "type" : "/music/artist",
  "return":"estimate-count"
}

Result:
354200

With a dataset this large, asking for an exact count with return:count is likely to result in a timeout.

In some circumstances, you may want to ask for the results of a query (up to the explicit limit you specify or up to the implicit limit of 100 results) and also ask for a count or estimate of the total number of results available. You might want to do this, for example, if you were displaying the results in paged form and wanted to include a message of the form "Results 1-20 of 317" on the first page. (The trick to retrieving subsequent pages of results is to use a cursor. This is explained in Section 4.2.4.1.)

If you want to retrieve results and a count, use the count:null directive or the estimate-count:null directive instead of the return directive. For example, to ask for the names of tracks by The Police, and a count of the total number of tracks, you might use this query:

Query:
[{
  "type" : "/music/track",
  "artist" : "The Police",
  "name" : null,
  "count" : null
}]

Result:
[{
  "type" : "/music/track",
  "artist" : "The Police",
  "name" : "Can't Stand Losing You",
  "count" : 133
},{
  "type" : "/music/track",
  "artist" : "The Police",
  "name" : "Message in a Bottle",
  "count" : 133
},
// 98 more results omitted...
]

This query returns 100 results, and the count property tells us that more results are available. Notice that the count property appears over and over again in each of the results, just as the type and artist properties do.
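As a rough illustration of how a client might turn such a response into a "Results 1-20 of 317" message, here is a minimal Python sketch. The function name and its parameters are hypothetical, and the sample data is invented; the only thing taken from the text above is that every result repeats the same count (or estimate-count) value.

```python
def results_banner(results, first=1, count_key="count"):
    """Build a paging message from a response to a count:null query.

    `results` is the parsed JSON array returned for the query.
    `first` is the 1-based position of the first result on this page.
    """
    if not results:
        return "No results"
    # Every result carries the same total, so read it from the first one.
    total = results[0][count_key]
    last = first + len(results) - 1
    return f"Results {first}-{last} of {total}"


# Invented sample page: 20 results out of a total of 317.
page = [{"name": f"Track {i}", "count": 317} for i in range(20)]
print(results_banner(page))
```

Passing count_key="estimate-count" handles responses to queries that used estimate-count:null instead.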

We could also have used estimate-count:null in the query. In that case, each of the 100 results would have included estimate-count:135.

3.5.3. Sorting Results

Use the sort directive if you'd like the Metaweb server to sort the results of your query before returning them. For example, to ask for the names of the tracks on an album in alphabetical order, sort them by name:

// Tracks on the album Synchronicity, in alphabetical order
{
  "type":"/music/album",
  "name":"Synchronicity",
  "artist":"The Police",
  "track": [{
    "name":null,
    "sort":"name"
  }]
}

As you can see, the sort directive simply specifies the name of the property by which the sort is to be done. To order these same tracks from shortest to longest, use "length" as the sort key:

// Tracks on the album Synchronicity, from shortest to longest
{
  "type":"/music/album",
  "name":"Synchronicity",
  "artist":"The Police",
  "track": [{
    "name":null,
    "length":null,
    "sort":"length"
  }]
}

Note that the query above includes "length":null. If you want to use a property as a sort key, you must query that property.

To reverse this order, precede the name of the sort key by a minus sign:

// Tracks on the album Synchronicity, from longest to shortest
{
  "type":"/music/album",
  "name":"Synchronicity",
  "artist":"The Police",
  "track": [{
    "name":null,
    "length":null,
    "sort":"-length"
  }]
}

The sorts shown above are convenient, but could easily be duplicated on the client side. That is, you could request unordered results from Metaweb and sort them yourself. One situation in which the sort directive cannot be duplicated on the client is when it interacts with the limit directive. Result sets are truncated to the specified limit after the sort is applied. Use sort and limit together in queries like this:

// What is the longest track on Synchronicity?
{
  "type":"/music/album",
  "name":"Synchronicity",
  "artist":"The Police",
  "track": {
    "name":null,
    "length":null,
    "sort":"-length",
    "limit":1
  }
}
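To see why this combination cannot be reproduced on the client, consider the following Python sketch. It only models the behavior described above (sort first, then truncate); the function name and the track data are hypothetical, and nothing here reflects how Metaweb is actually implemented.

```python
def sort_then_limit(items, key, limit, descending=False):
    """Order items by `key`, then keep only the first `limit` of them."""
    ordered = sorted(items, key=lambda item: item[key], reverse=descending)
    return ordered[:limit]


# Invented track data, in the arbitrary order a server might hold it.
tracks = [
    {"name": "Track A", "length": 200.0},
    {"name": "Track B", "length": 310.5},
    {"name": "Track C", "length": 150.2},
]

# What "sort":"-length" with "limit":1 asks for: the longest track.
longest = sort_then_limit(tracks, "length", 1, descending=True)

# What a client is left with if the limit is applied without the sort:
# an arbitrary track, which no amount of client-side sorting can fix.
arbitrary = tracks[:1]

print(longest[0]["name"], arbitrary[0]["name"])
```

A client that receives only the truncated, unsorted subset has already lost the track it was looking for.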

Sorting need not be limited to a single sort key. To specify more than one key, use an array on the right-hand side of the sort directive:

// List tracks by The Police, sorted from shortest to longest.
// Tracks of the same length should be in alphabetical order.
[{
  "type":"/music/track",
  "artist":"The Police",
  "name":null,
  "length":null,
  "sort":["length","name"]
}]

If your query includes sub-queries, then the properties of those sub-queries can also be used as sort keys. The query below uses this kind of hierarchically-named sort key. Note also that it has two distinct sort clauses.

// List all albums by The Police, along with the name of their longest track.
// Order the albums from longest longest track to shortest longest track.
[{
  "type":"/music/album",
  "artist":"The Police",
  "name":null,
  "track":{
    "name":null,
    "length":null,
    "sort":"-length",
    "limit":1
  },
  "sort":"-track.length"
}]

Finally, here is a complex example that uses multiple sort keys, hierarchically-named sort keys, and a sort key that includes a fully-qualified property name (see Section 3.4.1):

// Return a list of performances (character/actor pairs) in George Lucas films.
// Sort them by character name. If the character appears in more than
// one film, sort by film name. If more than one actor portrays the character
// in a single film, sort them by actor birthdate (most to least recent).
[{
  "type":"/film/performance",
  "film":{
    "name":null,
    "directed_by":"George Lucas"
  },
  "character":null,
  "actor":{
     "name":null,
     "/people/person/date_of_birth":null
  },
  "sort":["character",
          "film.name",
          "-actor./people/person/date_of_birth"]        
}]
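The way these sort keys are read (an optional leading minus sign, then a path whose steps are separated by dots) can be mimicked on the client. The Python sketch below is an illustration only: lookup and apply_sort are hypothetical names, the sample data is invented, and the code assumes that every sort key has a non-null value in every result.

```python
def lookup(result, path):
    """Follow a dotted sort-key path such as "film.name" through nested dicts."""
    value = result
    for step in path.split("."):
        value = value[step]
    return value


def apply_sort(results, sort):
    """Order results the way a sort directive describes.

    `sort` is a single key or a list of keys, most significant first.
    A leading "-" reverses the order for that key.
    """
    keys = [sort] if isinstance(sort, str) else list(sort)
    ordered = list(results)
    # Python's sort is stable, so sorting by the least significant key
    # first and the most significant key last gives a multi-key ordering.
    for key in reversed(keys):
        descending = key.startswith("-")
        path = key[1:] if descending else key
        ordered.sort(key=lambda r, p=path: lookup(r, p), reverse=descending)
    return ordered


DOB = "/people/person/date_of_birth"
performances = [  # invented data
    {"character": "B", "film": {"name": "F1"}, "actor": {"name": "w", DOB: "1950-01-01"}},
    {"character": "A", "film": {"name": "F2"}, "actor": {"name": "x", DOB: "1960-01-01"}},
    {"character": "A", "film": {"name": "F1"}, "actor": {"name": "y", DOB: "1940-05-05"}},
    {"character": "A", "film": {"name": "F1"}, "actor": {"name": "z", DOB: "1970-02-02"}},
]
ordered = apply_sort(performances, ["character", "film.name", "-actor." + DOB])
print([p["actor"]["name"] for p in ordered])
```

Splitting the path on dots works here because fully-qualified property names are written with slashes, not dots.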

3.5.4. Ordered Collections

If you do not include a sort directive in a query then Metaweb makes no guarantees about the order in which results are returned. If you don't ask for the data to be sorted, you should treat the result as an unordered set of values rather than an ordered list. [13]

Some data, such as the tracks on an album, have a natural order: the order in which they are arranged in the album. If you want results to be sorted according to this natural ordering, use "sort":"index". (Or, to reverse the natural ordering, use "sort":"-index".)

// Return the tracks on the album Synchronicity in the order that they appear
{
  "type":"/music/album",
  "artist":"The Police",
  "name":"Synchronicity",
  "track":[{
    "name":null,
    "index":null,
    "sort":"index"
  }]
}

Since we've used "index" as a sort key, we must query the value of "index" as well. index is a MQL directive that looks like a property, because it is queried with null, and because, unlike other directives, it is included in query results. When you include the "index":null directive in a query, the results include index properties whose values are the integers between 0 and n-1, where n is the number of ordered results. It is important to understand that indexes do not apply to objects in Metaweb, but to the relationships between objects. It is the link between the album "Synchronicity" and the track "Mother" that has an index of 3 (because it is the fourth song on the album), not the track itself. This becomes clear when you consider the case of a track that appears on more than one album: if "Mother" also appears on an album named "Greatest Hits" it is likely to have a different index on that album.

index is a directive, not a property. MQL read queries may use index as a sort key, and they may query the index with "index":null, but may not use the keyword in any other way. You cannot write "index":1 to ask for the second item in a set, for example. Similarly, index cannot be used with operators such as < to select indexes less than a given value. (We'll learn about < and other operators in Section 3.6.) The index directive can be used in other ways in write queries, however, and we'll learn about that in Chapter 5.

The index of a track on an album is an important and useful detail. And because a single track can appear on more than one album, it is not possible to capture this ordering with a property on the track object. It is for this reason that the index directive is useful here. This is a somewhat unusual case, however: links in Freebase do not typically have an order associated with them. Consider the set of albums by The Police instead of the set of tracks on Synchronicity. The most natural order for albums by a band is probably by release date. But this is captured by the release_date property of the album object, and there is no need to sort on index to obtain a chronological list of albums.

While most Freebase data is unordered, bear in mind that some is partially ordered. Any time you use the "index":null directive, you should be prepared for indexes of null to be returned along with the numbered indexes. The /film/film and /film/performance types provide an example. The starring property of a film links to performance objects that specify the cast (the actors and the characters they portray) of the film. Indexes are sometimes used to capture the billing order of the top stars in the movie, while minor performances are left unordered.
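One way a client might cope with partially ordered results is to separate the indexed items from the unindexed ones before displaying them. The following Python sketch shows the idea; the function name and the sample cast are hypothetical.

```python
def split_by_index(items):
    """Separate indexed items (sorted by index) from items whose index is null."""
    ordered = sorted(
        (item for item in items if item["index"] is not None),
        key=lambda item: item["index"],
    )
    unordered = [item for item in items if item["index"] is None]
    return ordered, unordered


# Invented cast list: two billed stars and one unordered minor performance.
cast = [
    {"actor": "Minor Player", "index": None},
    {"actor": "Second Star", "index": 1},
    {"actor": "First Star", "index": 0},
]
billed, others = split_by_index(cast)
print([c["actor"] for c in billed], [c["actor"] for c in others])
```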

The index directive can be used in conjunction with the sort and limit directives. Consider the following query, which asks for the top two stars in the movie Psycho:

Query:
[{
  "type" : "/film/film",
  "name" : "Psycho",
  "directed_by":"Alfred Hitchcock",
  "starring" : [{
    "actor" : null,
    "character" : null,
    "index" : null,
    "sort" : "index",
    "limit" : 2
  }]
}]

Result:
[{
  "type" : "/film/film",
  "name" : "Psycho",
  "directed_by" : "Alfred Hitchcock",
  "starring" : [{
    "actor" : "Anthony Perkins",
    "character" : "Norman Bates",
    "index" : 0
  },{
    "actor" : "Janet Leigh",
    "character" : "Marion Crane",
    "index" : 1
  }]
}]

3.5.4.1. Indexes are Relative

A query that asks for the final two tracks on the album Synchronicity (by sorting on "-index" with a limit of 2) correctly returns the names of those tracks. Look carefully, however, at the index values it returns: the last track is given an index of 1 and the penultimate track an index of 0. This is not a bug: the query simply reveals the true nature of ordered collections in Metaweb. Metaweb does not include an absolute index for each link. The implementation is able to say whether any link is greater-than or less-than another, but it cannot tell you the absolute position of that link within the complete set of links.

The number that Metaweb returns as the value of the index property is a synthetic one, generated by Metaweb as a simple way to express the order of elements. If Metaweb returns an array holding n indexed elements, then it generates index values for those elements that range from 0 to n-1. (There may be additional elements in the array that have an index of null, however.) For example, if you ask for the last two tracks on an album, the resulting values have indexes 0 and 1. If you ask for tracks that are shorter than 2 minutes (we'll learn how to do this in Section 3.6) and Metaweb finds three of them, then it will assign them index values of 0, 1, and 2. If you want to know the track number for the tracks on a particular album, you must query the complete set of tracks. Then add one to the index value to get the track number. If you want to know the track numbers of the short songs, you must query the complete set of tracks, and search for the short songs yourself.
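The procedure just described (query the complete set of tracks, add one to each index, then search for the short songs yourself) might look like this on the client. This Python sketch uses hypothetical function names and invented track data, and it assumes that track names are unique within the album.

```python
def track_numbers(all_tracks):
    """Map track name to 1-based track number.

    `all_tracks` must be the COMPLETE indexed track list for the album;
    indexes from a filtered or limited query are relative to that subset.
    """
    return {t["name"]: t["index"] + 1 for t in all_tracks if t["index"] is not None}


def short_track_numbers(all_tracks, max_seconds):
    """Find the track numbers of tracks shorter than `max_seconds`."""
    numbers = track_numbers(all_tracks)
    return {
        t["name"]: numbers[t["name"]]
        for t in all_tracks
        if t["name"] in numbers
        and t["length"] is not None
        and t["length"] < max_seconds
    }


# Invented album: lengths are in seconds.
album_tracks = [
    {"name": "One", "index": 0, "length": 250.0},
    {"name": "Two", "index": 1, "length": 95.0},
    {"name": "Three", "index": 2, "length": 110.5},
]
print(short_track_numbers(album_tracks, 120))
```

Had we asked Metaweb for only the short tracks, they would have come back with indexes 0 and 1, which say nothing about their positions on the album.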

3.5.5. Optional Matches

In addition to the limit, return, sort and index directives, MQL also includes an optional directive. If part of your query is not required to match, add "optional":"optional" to it. For example, we can use the optional directive to ask the question: "What bands have recorded the song "Masters of War", and do they have a Greatest Hits album?". The query looks like this:

Query:
[{
  "type" : "/music/artist",
  "name" : null,
  "track" : "Masters of War",
  "album" : {
    "name" : "Greatest Hits",
    "optional" : "optional"
  }
}]

Result:
[{
  "type" : "/music/artist",
  "name" : "Kevn Kinney",
  "track" : "Masters of War",
  "album" : null
},
{
  "type" : "/music/artist",
  "name" : "Timesbold",
  "track" : "Masters of War",
  "album" : null
},
{
  "type" : "/music/artist",
  "name" : "Bob Dylan",
  "track" : "Masters of War",
  "album" : { "name" : "Greatest Hits" }
}]

Without the optional directive, the query would only return bands that have recorded Masters of War and have released a Greatest Hits album. With the optional directive, we get all bands that have recorded the song, and additionally, we find out whether or not they have released a Greatest Hits album.

Optional queries can be nested inside optional queries. The following query is an extension to the one above. It further asks whether "Masters of War" appears on the Greatest Hits album. Without the nested optional directive, only Greatest Hits albums that include the song would be matched.

[{
  "type" : "/music/artist",
  "name" : null,
  "track" : "Masters of War",
  "album" : {
    "name" : "Greatest Hits",
    "optional" : "optional",
    "track" : {
      "name" : "Masters of War",
      "optional" : "optional"
    }
  }
}]

MQL allows "optional":true instead of "optional":"optional". You can also write "optional":"required" or "optional":false to indicate that a match is required, but this is the default and is never necessary or useful.

3.5.5.1. When to use the optional Directive

In order to understand when to use the optional directive, it is important to understand when a sub-query requires a match and when it does not. Suppose we want to ask for a list of artists who have recorded "Masters of War" and for the nicknames of those artists, if they have any:

[{
  "type" : "/music/artist",
  "track" : "Masters of War",
  "name" : null,
  "/common/topic/alias" : []
}]

This request for aliases is implicitly optional: if an artist has no nicknames, [] will be returned. The same is true if we write "/common/topic/alias":[{}]. If none of the matching artists has more than one nickname, then we could even write "/common/topic/alias":null. In that case, each result would include either the single nickname or null. These queries simply ask for aliases; they do not constrain the query to match only artists that do have nicknames. Suppose, however, that we're only interested in English nicknames:

[{
  "type" : "/music/artist",
  "track" : "Masters of War",
  "name" : null,
  "/common/topic/alias" : [{"value":null, "lang":"/lang/en"}]
}]

We've now introduced a constraint into the sub-query, and only artists who have at least one English-language alias will match. Even though this alias sub-query is in square brackets, it still requires at least one match unless we add an optional directive:

[{
  "type" : "/music/artist",
  "track" : "Masters of War",
  "name" : null,
  "/common/topic/alias" : [{
    "optional":true,
    "value":null,
    "lang":"/lang/en"
  }]
}]

The queries [] and [{}] do not require a match, but putting any property inside the curly braces transforms the sub-query into one that is required. This is true even if that property is a query rather than a constraint. So while "/common/topic/alias":[] will match objects that have no aliases, the same is not true of this query:

"/common/topic/alias":[{"value":null}]

By explicitly stating our request for the value of each alias, we've transformed the query from one that is implicitly optional to one that requires at least one match.

3.5.5.2. Optional and return:count

Queries that use return:count can add optional:true so that they can return a count of zero instead of returning null to indicate a failure to find a match:

Query:
{
  "type" : "/music/artist",
  "name" : "The Police",
  "album" : {
    "name": "Arrested",
    "return":"count",
    "optional":true
  }
}

Result:
{
  "type" : "/music/artist",
  "name" : "The Police",
  "album" : 0
}

Without the optional directive, this query would return null because no match would be found at all. Note that using optional is only necessary in sub-queries: return:count at the toplevel returns 0 when no match is found.

3.5.6. Forbidden Matches

The optional directive can be used in another important way. If we include "optional":"forbidden" in a sub-query, then the results will not include any values for which the sub-query matches. Here, for example, is how we find bands that have recorded "Masters of War" and have not released a Greatest Hits album:

Query:
[{
  "type" : "/music/artist",
  "name" : null,
  "track" : "Masters of War",
  "album" : {
    "name" : "Greatest Hits",
    "optional" : "forbidden"
  }
}]

Result:
[{
  "type" : "/music/artist",
  "name" : "Kevn Kinney",
  "track" : "Masters of War",
  "album" : null
},
{
  "type" : "/music/artist",
  "name" : "Timesbold",
  "track" : "Masters of War",
  "album" : null
}]

Note that if a sub-query includes "optional":"forbidden", the response to that sub-query will always be null. Note also that there is no boolean alternative to the string "forbidden" – a value of false means "required", not "forbidden".
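The rules from Section 3.5.5 and this section can be summarized in a short Python sketch that classifies a sub-query as required, optional, or forbidden. This is a reading aid rather than a description of how Metaweb evaluates queries: match_mode is a hypothetical name, the sub-query is assumed to be already parsed from JSON (so null is None and true is True), and the bare {} form, which is not discussed above, is deliberately rejected.

```python
def match_mode(subquery):
    """Classify a parsed sub-query value as "required", "optional" or "forbidden"."""
    if isinstance(subquery, dict) and not subquery:
        raise ValueError("bare {} is not covered by the rules in this chapter")
    # null, [] and [{}] only ask for values; they never exclude a result.
    if subquery is None or subquery == [] or subquery == [{}]:
        return "optional"
    clause = subquery[0] if isinstance(subquery, list) else subquery
    if isinstance(clause, dict):
        directive = clause.get("optional", "required")
        if directive is True or directive == "optional":
            return "optional"
        if directive == "forbidden":
            return "forbidden"
    # Literal values, and any {...} clause containing a property,
    # require at least one match by default.
    return "required"


print(match_mode([{"value": None}]))
print(match_mode({"name": "Greatest Hits", "optional": "forbidden"}))
```

Note that false falls through to "required", reflecting the point made above that there is no boolean spelling of "forbidden".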

A query can include multiple forbidden sub-queries. This query, for example, finds bands that have not released albums named "Greatest Hits" or "The Best Of":

[{
  "type":"/music/artist",
  "name":null,
  "neither:album":{"optional":"forbidden", "name":"Greatest Hits"},
  "nor:album":{"optional":"forbidden", "name":"The Best Of"}
}]

When we use null in a MQL query, we're making a request for the value of a property, rather than constraining that property to have the value null. You can use optional:forbidden to write queries that constrain a property to be null. Suppose we wanted to ask Freebase for a list of bands that have no known albums (note that an optional:forbidden clause must always have another clause present, so "id":null has been included in the query):

[{
  "type" : "/music/artist",
  "name" : null,
  "album" : {
    "optional" : "forbidden",
    "id" : null
  }
}]

MQL also supports a != operator, which excludes results from a query in a different way. We'll learn about != in

3.6. MQL Operators

A property in a MQL query that is not itself a query (i.e. one whose value is not null or []) expresses a constraint. A normal property/value pair constrains results to objects for which that property is equal to that value. But "equal to" is not the only constraint we can express in MQL. In this section we introduce operators such as <, ~=, != and |= which express constraints "less than", "matches", "not equal to", and "one of".

If MQL were not based on JSON, these operators could naturally appear between a property name and its value. Since MQL uses JSON, however, operators are instead appended to a property name and appear inside the quotation marks. To express a "less than" constraint, for example, we might write:

"date_of_birth<" : "2000"

When a property name is followed by an operator, the value that follows must be a JSON literal (or, for the |= operator, a JSON array): MQL sub-queries in curly braces are not allowed with operators. Finally, note that properties that include operators are like MQL directives and do not appear in the results of the query.
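Because the operator is part of the JSON property name, a client that wants to inspect a query has to separate the two itself. The Python sketch below shows one way to do so. The function name is hypothetical, and the operator list contains only the operators named in this section.

```python
# Operators named in this section, written as suffixes of a property name.
OPERATORS = ("<=", ">=", "~=", "!=", "|=", "<", ">")


def split_operator(key):
    """Split a query key such as "length>=" into ("length", ">=").

    Keys that carry no operator are returned with an operator of None.
    """
    for op in OPERATORS:
        if key.endswith(op):
            return key[: -len(op)], op
    return key, None


print(split_operator("date_of_birth<"))
print(split_operator("name"))
```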

3.6.1. Order Constraints

We know how to ask "what are the names and lengths of the tracks on the album Synchronicity by The Police?". The query looks like this:

{
  "type":"/music/album",
  "name":"Synchronicity",
  "artist":"The Police",
  "track":[{"name":null, "length":null}]
}

Metaweb also allows us to ask "What are the names and lengths of the long songs on the album?" The query below includes a numeric constraint on the length property, and the freebase.com response only includes the two songs on the album that are longer than 300 seconds:

Query Result
{
  "type":"/music/album",
  "name":"Synchronicity",
  "artist":"The Police",
  "track":[{
    "name":null,
    "length":null,
    "length>":300
  }]
}
{
  "type" : "/music/album",
  "name" : "Synchronicity",
  "artist" : "The Police",
  "track" : [{
    "name" : "Synchronicity II",
    "length" : 305.066
  }, {
    "name" : "Wrapped Around Your Finger",
    "length" : 313.733
  }]
}

The line "length>":300 in the query expresses a constraint to Metaweb: it specifies that the track must be longer than 300 seconds. In addition to >, you can also use < for less-than, and <= and >= for less-than-or-equal and greater-than-or-equal. Note, however, that no spaces are allowed before or after these punctuation characters.

Note that constraining the length property with > does not automatically query the property. We must include a separate "length":null line in our query if we want to know the exact length of the returned tracks. Note that according to JSON syntax length> is a different property than length, so there is no need to use a property prefix to distinguish the constraint from the query.

You can include more than one numeric constraint on the same property, restricting the value to a range. Here's how we ask for songs that are at least three minutes long, but less than four:

{
  "type":"/music/album",
  "name":"Synchronicity",
  "artist":"The Police",
  "track":[{
    "name":null,
    "length>=":180,
    "length<":240
  }]
}

Note that this query constrains the value of the length property, but does not ask Metaweb to return the exact value of the property.

Numbers are not the only type that can be constrained with these operators. Here, for example, is a query that constrains a /type/datetime property to obtain a list of albums released in January 1999:

[{
   "type":"/music/album",
   "name":null,
   "artist":null,
   "release_date>=":"1999-01-01",
   "release_date<=":"1999-01-31"
}]

When the <, >, <=, and >= operators are used with strings, they compare in case-insensitive, Unicode-aware alphabetical order. For example, to find bands whose name begins with the letter A or B, use this query:

[{
  "type" : "/music/artist",
  "name" : null,
  "name>=" : "A",
  "name<" : "C"
}]
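The effect of the two string constraints can be approximated locally. The Python sketch below is only an illustration: it uses str.casefold() as a stand-in for Metaweb's case-insensitive, Unicode-aware ordering, which may differ in detail, and the band names are sample data rather than Freebase results.

```python
# Sample names are illustrative only, not Freebase results.
names = ["ABBA", "blondie", "Cream", "The Police"]

def in_range(name, low, high):
    """True if low <= name < high, ignoring case."""
    folded = name.casefold()
    return low.casefold() <= folded < high.casefold()

# Equivalent in spirit to "name>=":"A" combined with "name<":"C".
selected = [n for n in names if in_range(n, "A", "C")]
print(selected)
```

Note that "Cream" is excluded: it sorts after "C", so it fails the "name<":"C" constraint.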

It is not legal to use any of these order operators with the id (or guid) property. Fully-qualified names and guids cannot be compared alphabetically or numerically.

3.6.2. Pattern Matching with the ~= Operator

The MQL pattern matching operator ~= tests a property to see if it contains a specified word or phrase. [14] To try this out, let's find some short songs about love:

[{
   "type":"/music/track",
   "artist":null,
   "name":null,
   "name~=":"love",
   "length":null,
   "length<":120
}]

Here's a query for songs about love recorded by bands whose name begins with "The":

[{
   "type":"/music/track",
   "artist":null,
   "artist~=":"^The",
   "name":null,
   "name~=":"love"
}]

Results include Love Shack by The B-52's and For Your Love by The Yardbirds. Notice that the constraint on the artist property in the query above uses the ^ character to specify that the word The must appear at the beginning of the artist's name. (This is like the anchor syntax used in regular expressions, but note that MQL patterns are not nearly as general as regular expressions.)

Here's a query to find all bands whose name is two words long and begins with the word The (such as The Police and The Clash).

[{
  "type" : "/music/artist",
  "name" : null,
  "name~=" : "^The *$"
}]

This query is interesting in several ways. First, it uses ^ again to anchor the match to the beginning of the string. Second, it uses $ to anchor the match to the end of the string. Finally, the * character matches any string of characters (other than spaces), which here means any single word.

Table 3.2 summarizes MQL pattern matching syntax.

Table 3.2. MQL Pattern Matching Syntax

Pattern Matches
love

Matches any string that contains the word "love". Does not match strings containing "glove" or "lover".

love you

Matches any string that contains the exact phrase "love you", such as "Hello, I Love You" but not "All You Need is Love".

love*

Matches any string containing a word that begins with "love", such as "love", "lover" or "lovely". Does not match "glove".

*love

Matches any string containing a word that ends with "love", such as "love" or "glove".

*love*

Matches any string that contains "love", such as "love", "glove", "lover" and "glover".

^

Matches the beginning of a string. For example, ^the matches any string that begins with the word "the", and ^the* matches any string that begins with a word that begins with "the", such as "they" or "there".

$

Matches the end of a string. For example, hits$ matches any string that ends with the word "hits", and *love$ matches "Sunshine of your Love" and "Smell the Glove".

*

Matches a single word. For example, ^*$ matches any single-word string, and I * you matches any string that contains a 3-word phrase beginning with "I" and ending with "you".

-

A hyphen or other punctuation matches an optional space. For example, bi-directional matches "bi directional", "bi-directional", or "bidirectional".

\

Use a backslash to escape any punctuation character that you want to match literally. bi\-directional matches any string that contains the hyphenated word "bi-directional", for example. Note, however, that JSON string literals require backslashes like this to be doubled. If you type a JSON query "by hand" or use string manipulation techniques to create a query, be sure to double the backslashes. If you use a JSON serializer to create the query, it should double the backslashes for you.

numbers

When the pattern to be matched looks like a number, any numbers in the text that is being matched are first converted to normalized form. This means that leading zeros are removed, trailing zeros after the decimal point are removed, a zero is added before the decimal point if there is no digit there, and so on. If the match against the normalized text does not succeed, it is tried again with the numbers in their original, unnormalized form. This means that the pattern "7" matches "Agent 007", "July 07, 2008", and "7.0". But the pattern "007" does not match "7", "07", or "7.0".
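The backslash rule in Table 3.2 is easy to get wrong by hand. As a small illustration, the Python sketch below uses the standard json module to serialize a query whose pattern contains a single backslash; the query itself is just an example built for this purpose.

```python
import json

# One backslash in the pattern itself (a raw string keeps it single).
pattern = r"bi\-directional"
query = [{"type": "/music/track", "name": None, "name~=": pattern}]

serialized = json.dumps(query)
print(serialized)  # the backslash appears doubled in the JSON text

# Parsing the JSON text recovers the original single-backslash pattern.
assert json.loads(serialized)[0]["name~="] == pattern
```

The serializer doubles the backslash for us, so the pattern can be written once in its natural form.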


Here's another example: what bands have three-word names that begin with "the" and end with a plural (e.g. The Beach Boys, The Doobie Brothers)?

[{
  "type" : "/music/artist",
  "name" : null,
  "name~=" : "^The * *s$"
}]
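To build intuition for these patterns, here is a rough Python sketch that translates a subset of the syntax in Table 3.2 into a regular expression. It is an approximation written for this discussion, not Metaweb's implementation: it handles only words, phrases, *, ^ and $, it ignores the hyphen, backslash and number rules, and it strips punctuation from the text before matching (an assumption). The names mql_pattern_to_regex and matches are hypothetical.

```python
import re

def mql_pattern_to_regex(pattern):
    """Translate a simple MQL pattern into a compiled regular expression."""
    anchored_start = pattern.startswith("^")
    anchored_end = pattern.endswith("$")
    core = pattern.lstrip("^").rstrip("$")
    words = []
    for token in core.split():
        if token == "*":
            # A lone asterisk stands for exactly one word.
            words.append(r"\S+")
        else:
            # An asterisk inside a word matches any run of non-space characters.
            words.append(r"\S*".join(re.escape(part) for part in token.split("*")))
    body = r"\s+".join(words)
    # Without an anchor, the phrase must still start and end on word boundaries.
    prefix = "^" if anchored_start else r"(?<!\S)"
    suffix = "$" if anchored_end else r"(?!\S)"
    return re.compile(prefix + body + suffix, re.IGNORECASE)

def matches(pattern, text):
    """Return True if the text matches the pattern (approximate)."""
    normalized = re.sub(r"[^\w\s]", "", text)  # drop punctuation (assumption)
    return mql_pattern_to_regex(pattern).search(normalized) is not None

print(matches("love you", "Hello, I Love You"))
print(matches("^The *$", "The Beach Boys"))
print(matches("^The * *s$", "The Beach Boys"))
```

The sketch treats a multi-word pattern as a phrase: the words must appear next to each other and in order.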

Note that when a pattern includes multiple words (or even a word and an asterisk), Metaweb doesn't just attempt to match each word individually: it looks for a matching phrase. The pattern "I love" only matches strings that contain those two words in that order. It does not match "I want your love", for example. If you want to match any string that contains the words "I" and "love", regardless of order, you should use two separate properties:

[{
   "type":"/music/track",
   "name":null,
   "a:name~=":"I",
   "b:name~=":"love"
}]

Here are two final notes about MQL pattern matching. First, all searches are case-insensitive. Second, it is not legal to perform pattern matching on the id (or guid) property.

3.6.3. The "one of" Operator |=

MQL uses the |= operator to restrict the value of a property to a set of possible values, which are expressed as a JSON array of JSON literals. The constraint says "match any one of the values in this array". Here's how we can find a list of bands who have recorded an album named "Greatest Hits" or an album named "Super Hits":

[{
  "type":"/music/artist",
  "name":null,
  "album|=":["Greatest Hits","Super Hits"],
  "album":[]
}]

The album property in the response will include one or more album names, but the names will all be either "Greatest Hits" or "Super Hits".

The values in the array can be numbers instead of strings. Here's how we look up the names of the first three chemical elements:

Query:

[{
  "type":"/chemistry/chemical_element",
  "name":null,
  "atomic_number|=":[1,2,3],
  "atomic_number":null,
  "sort":"atomic_number"
}]

Result:

[{
  "type" : "/chemistry/chemical_element",
  "name" : "Hydrogen",
  "atomic_number" : 1
},{
  "type" : "/chemistry/chemical_element",
  "name" : "Helium",
  "atomic_number" : 2
},{
  "type" : "/chemistry/chemical_element",
  "name" : "Lithium",
  "atomic_number" : 3
}]

Unlike the order and pattern-matching operators, the |= operator can be used on the id property, and this is a useful way to run the same query over multiple objects specified by id. The following query asks for the properties of three types, for example:

[{
  "id|=":["/type/type", "/type/property", "/type/key"],
  "id":null,
  "/type/type/properties":[]
}]

Finally, here is an example that uses the |= constraint in two different places. It asks for the French and Spanish translations of the countries named "England" and "France":

Query:

[{
  "type":"/location/country",
  "english:name|=": ["England",
                     "France"],
  "english:name": null,
  "foreign:name": [{
    "value":null,
    "lang":null,
    "lang|=":["/lang/fr",
              "/lang/es"]
  }]
}]

Result:

[{
  "type" : "/location/country",
  "english:name" : "England",
  "foreign:name" : [
    {"lang" : "/lang/fr", "value" : "Angleterre"},
    {"lang" : "/lang/es", "value" : "Inglaterra"}
  ]
},{
  "type" : "/location/country",
  "english:name" : "France",
  "foreign:name" : [
    {"lang" : "/lang/fr",  "value" : "France"},
    {"lang" : "/lang/es",  "value" : "Francia"}
  ]
}]

Most MQL operators expect a single JSON literal as their value. The |= operator instead expects a JSON array of JSON literals. Only literals are allowed in the array: the following query, for example, is not legal:

[{
  "type":"/music/artist",
  "name":null,
  "album|=":[
    {"name":"Greatest Hits", "lang":"/lang/en"},   // Invalid MQL!
    {"name":"Super Hits", "lang":"/lang/en"}
  ],
  "album":[]
}]
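A client program can catch this mistake before sending a query. The following Python sketch checks that a |= value is an array containing no objects or nested arrays. The function name is hypothetical, and the check reflects only the rule stated above; it is not a full MQL validator.

```python
def is_valid_one_of_value(value):
    """Check that a |= constraint value is a list with no nested structures."""
    if not isinstance(value, list):
        return False
    # Sub-queries (dicts) and nested arrays (lists) are not JSON literals.
    return all(not isinstance(item, (dict, list)) for item in value)

print(is_valid_one_of_value(["Greatest Hits", "Super Hits"]))
print(is_valid_one_of_value([{"name": "Greatest Hits", "lang": "/lang/en"}]))
```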

3.6.4. The "but not" Operator !=

The != operator says that the constrained property can be anything but the specified value. (It does require that the property be something, however: it does not match objects for which the property is null.) Here, for example, is how we list albums by The Police, but not Greatest Hits:

{
  "type":"/music/artist",
  "name":"The Police",
  "album":[{
    "name":null,
    "name!=":"Greatest Hits"
  }]
}

And here we use != to list the names of chemical elements other than elements 1, 2, and 3:

[{
  "type":"/chemistry/chemical_element",
  "name":null,
  "atomic_number!=":1,
  "a:atomic_number!=":2,
  "b:atomic_number!=":3
}]
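Because each != constraint needs its own key, excluding a list of values means generating one prefixed property per value. Here is a small Python sketch of that idea; exclude_values is a hypothetical helper, and the single-letter prefixes simply follow the pattern used in the query above.

```python
import json
from string import ascii_lowercase

def exclude_values(prop, values):
    """Return one "!=" constraint per value, each under a unique key."""
    constraints = {}
    for index, value in enumerate(values):
        # The first constraint needs no prefix; later ones get "a:", "b:", ...
        # (this sketch supports at most 27 values).
        prefix = "" if index == 0 else ascii_lowercase[index - 1] + ":"
        constraints[f"{prefix}{prop}!="] = value
    return constraints

query = [{"type": "/chemistry/chemical_element", "name": None}]
query[0].update(exclude_values("atomic_number", [1, 2, 3]))
print(json.dumps(query, indent=2))
```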

The != operator is like the optional:forbidden directive in that it excludes values from the results. The operator and the directive are quite different, however, and it is important to understand when to use each. Contrast the album query above that uses != with the following query, which uses optional:forbidden to say "list albums by The Police, excluding any that contain the song Roxanne":

{
  "type":"/music/artist",
  "name":"The Police",
  "album":[{
    "name":null,
    "track":{
      "name":"Roxanne",
      "optional":"forbidden"
    }
  }]
}

The != operator excludes a single JSON literal and is useful with unique properties, like atomic_number and name (which behaves like a unique property even though it technically isn't). The optional:forbidden directive, on the other hand, excludes any match of the sub-query that contains it. It works with non-unique properties and expresses the idea "may not include".

The query with which we began this section can be simplified as follows:

{
  "type":"/music/artist",
  "name":"The Police",
  "album":[],
  "album!=":"Greatest Hits"
}

This makes the query more difficult to understand, however. It appears as if the != operator is constraining the non-unique property album. In fact, however, it is expressing a constraint on the default name property of any albums.

It is important to understand that when you use != on a property you are implicitly constraining the results to objects for which that property exists and has a value different than the one you specify. Even though a property value of null is different than the one you specify, it is not matched. Consider the following three queries, for example:

// How many albums have The Police released?
{
  "return":"count",
  "type":"/music/album",
  "artist":"The Police"
}
// How many live albums have they released?
{
  "return":"count",
  "type":"/music/album",
  "artist":"The Police",
  "release_type":"Live Album"
}
// How many non-live albums have they released?
{
  "return":"count",
  "type":"/music/album",
  "artist":"The Police",
  "release_type!=":"Live Album"
}

The first query returns a count of 22, and you might expect that, since every album is either live or not live, the counts of the second and third queries would sum to 22. But, in fact, the second query returns 3 and the third returns 8. There are 8 Police albums that have a defined release_type whose value is something other than "Live Album". The other 11 Police albums do not have a release_type defined in Freebase (or did not when this was written). It turns out that optional:forbidden is what we want here. This query returns the count of 19 that we expected:

{
  "return":"count",
  "type":"/music/album",
  "artist":"The Police",
  "release_type" : {"optional":"forbidden", "name":"Live Album"}
}
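The arithmetic behind these counts can be reproduced with a small local model. The Python sketch below is only an emulation of the two kinds of NOT, using stand-in data shaped to match the counts quoted above; "Album" is a placeholder release type, not a value taken from Freebase.

```python
# Stand-in data: 3 live albums, 8 with some other release type, 11 with none.
albums = (
    [{"release_type": "Live Album"}] * 3
    + [{"release_type": "Album"}] * 8
    + [{"release_type": None}] * 11
)

total = len(albums)
live = sum(1 for a in albums if a["release_type"] == "Live Album")

# "!=" requires that a value exists AND differs from the one given.
not_equal = sum(
    1 for a in albums
    if a["release_type"] is not None and a["release_type"] != "Live Album"
)

# optional:forbidden rejects only the albums that DO match the sub-query.
forbidden = sum(1 for a in albums if a["release_type"] != "Live Album")

print(total, live, not_equal, forbidden)
```

The difference between the last two counts is exactly the number of albums with no release_type at all.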

3.6.5. Expressing AND, OR, and NOT in MQL

We conclude this section on MQL operators with a discussion of Boolean operations in MQL queries. Boolean AND is the default operation in MQL: each of the properties in a MQL query expresses a constraint, and these constraints are implicitly ANDed together. Consider:

[{
  "type":"/music/artist",
  "name":null,
  "name~=":"^The",
  "album":"Greatest Hits"
}]

This query says: tell me the names of objects which have type "/music/artist" AND which have a name that begins with "The" AND which have an album named "Greatest Hits".

The ability to use property prefixes in MQL allows us to express an AND on the value of a single property, as in the following query which asks "What are the names of objects that are both musical artists AND people AND have recorded a Christmas album?":

[{
  "name":null,
  "type":"/music/artist",
  "and:type":"/people/person",
  "album~=":"Christmas"
}]

The |= operator was introduced as the "one of" operator, but note that we could also have called it the "OR" operator, as in the following query, which asks "Return the names of albums recorded by The Police OR Sting":

[{
  "type":"/music/album",
  "name":null,
  "artist|=":["The Police","Sting"]
}]

Note that |= is a specialized operator, and expresses OR in a much less general way than the implicit AND of MQL. First of all, |= applies to only one property: it is not possible to request a list of albums released before 1990 or with genre "alternative rock". To obtain such a list, you must simply make two queries and combine the results. (It is possible to send two distinct queries in a single HTTP request. We'll learn how to do this in Chapter 4.)
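Combining the results of two queries is straightforward on the client side. The Python sketch below merges result lists and removes duplicates by id. It assumes that both queries asked for "id":null so that every result carries an id; the function name and the sample results (including their ids) are made up for illustration.

```python
def union_by_id(*result_lists):
    """Merge result lists, keeping the first occurrence of each id."""
    merged = {}
    for results in result_lists:
        for item in results:
            merged.setdefault(item["id"], item)
    return list(merged.values())

# Hypothetical results of two separate queries (ids and names are made up).
before_1990 = [
    {"id": "/example/album1", "name": "A"},
    {"id": "/example/album2", "name": "B"},
]
alternative_rock = [
    {"id": "/example/album2", "name": "B"},
    {"id": "/example/album3", "name": "C"},
]

combined = union_by_id(before_1990, alternative_rock)
print([album["id"] for album in combined])
```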

MQL provides two distinct ways to express a Boolean NOT in a query. The != operator says that a property must be anything but the specified value. The optional:forbidden directive instead specifies that the set of values for a property may not include an object that matches the sub-query of which it is a part. The distinction is subtle and it is important to understand when to use each form of NOT, so we'll repeat some previously-seen examples.

First, here is how you can use != to say "list the albums by The Police but NOT Greatest Hits":

{
  "type":"/music/artist",
  "name":"The Police",
  "album":[{
    "name":null,
    "name!=":"Greatest Hits"
  }]
}

Contrast that with the following query which asks for a list of bands that do NOT have a Greatest Hits album:

[{
  "type":"/music/artist",
  "name":null,
  "album":{
    "optional":"forbidden",
    "name":"Greatest Hits"
  }
}]

Let's conclude with an example that combines AND, OR, and NOT into a single query:

[{
  "type" : "/music/album",
  "name" : null,
  "name!=" : "Greatest Hits",
  "release_date|=" : ["1978","1979"],
  "a:genre" : "New Wave",
  "b:genre|=" : ["Punk Rock", "Post-punk"],
  "artist" : {
    "name" : null,
    "type" : {
      "id" : "/people/person",
      "optional" : "forbidden"
    }
  }
}]

This query asks for the names of albums which:

  • are NOT named "Greatest Hits" AND

  • were released in 1978 OR 1979 AND

  • have a genre of "New Wave" AND

  • also have a genre of either "Punk Rock" OR "Post-punk" AND

  • were recorded by a musical artist who is NOT a person (i.e. by a band and not a solo artist).

Results include "Outlandos d'Amour" by The Police and "Go 2" by XTC.

3.7. Links, Reflection and History

This section of the chapter covers three related advanced features of MQL. The link directive and the type /type/link enable us to write MQL queries that return details about the links between objects rather than about the objects themselves. The first related feature is reflection. Reflection allows us to ask for all links to or from an object. It is something like MQL wildcards, but is link-based rather than type-based. The second advanced feature related to links is history. Metaweb databases are journal-based and retain a record of every link ever made, even after those links are deleted or replaced. It is possible, therefore, to use properties of /type/link to query the modification history of any Metaweb object.

3.7.1. Links to Sub-queries

Here's a query that we come back to time and again in this chapter. It asks for details about the album Synchronicity by The Police:

{
  "id" : "/en/the_police",
  "/music/artist/album" : {
    "name" : "Synchronicity",
    "track":[]
  }
}

Instead of asking about the tracks on the album, let's now ask for details on the link between the band and the album:

Query:

{
  "id" : "/en/the_police",
  "/music/artist/album" : {
    "name" : "Synchronicity",
    "link" : {}
  }
}

Result:

{
  "id" : "/en/the_police",
  "/music/artist/album" : {
    "name" : "Synchronicity",
    "link" : {
      "type" : "/type/link",
      "master_property" : "/music/album/artist",
      "reverse" : true
    }
  }
}

There are many points to note about this query and its results. We'll start by saying that link is a MQL directive, not an object property. A link directive requests information about the link between the object (the album in this case) that matches the query that the directive appears in and the object (the band) that matches the parent query. The link directive can only appear in a nested query – it makes no sense in a toplevel query. (Toplevel link queries require a different syntax, explained below.)

When we query a Metaweb object with {}, we get back the name, type, and id of the object. When we query a primitive with {}, we get back the type and value of the primitive (and also the lang and namespace properties of /type/text and /type/key). The results of the query above demonstrate that links are neither objects nor primitive values: they are something completely different. A link represents a relationship between two objects; it is not an object itself. Nor is it a primitive value: it carries too much information to be a simple primitive. Links do not have names, ids, or guids. Metaweb reports the type of a link as /type/link, but this is a synthetic type (like /type/object) that simply serves as a collection of properties: actual Metaweb objects are never assigned this type.

In addition to noting the absence of the name and id properties in the link results above, let's consider the master_property and reverse properties. The master_property property of a link identifies the fully-qualified name of the master property that connects the two objects. In this case we learn that the band The Police and the album Synchronicity are connected through the /music/album/artist property. The reverse property of the link tells us whether the link was followed forward or "in reverse". In this case, reverse is true, because we started with the band and followed the property /music/artist/album to the album. Recall that /music/artist/album is the reciprocal of /music/album/artist.

Links are different from objects and primitives in another way, too. The default property of a link is not name, id, or value. Instead, when we query a link using "link":null, it is the master_property of the link that is returned:

Query:

{
  "id" : "/en/the_police",
  "/music/artist/album" : {
    "name" : "Synchronicity",
    "link" : null
  }
}

Result:

{
  "id" : "/en/the_police",
  "/music/artist/album" : {
    "name" : "Synchronicity",
    "link" : "/music/album/artist"
  }
}

The master_property and reverse properties of a link aren't terribly useful in queries like these, since the very structure of the query shows us the property that is being followed. They are useful in a different style of link query (shown below), however, and are also useful with reflective queries (explained later in this section). Links do have properties other than master_property and reverse, however, and we can use a wildcard to discover them:

Query:

{
  "id" : "/en/the_police",
  "/music/artist/album" : {
    "name" : "Synchronicity",
    "link" : { "*" : null}
  }
}

Result:

{
  "id" : "/en/the_police",
  "/music/artist/album" : {
    "name" : "Synchronicity",
    "link" : {
      "type" : "/type/link",
      "master_property" : "/music/album/artist",
      "reverse" : true,
      "source" : "Synchronicity",
      "target" : "The Police",
      "target_value" : null,
      "operation" : "insert",
      "valid" : true,
      "timestamp" : "2006-12-10T12:23:59.0685Z",
      "creator" : "/user/mwcl_musicbrainz"
    }
  }
}

The results of this query show a number of link properties. We've already seen type, master_property and reverse. source and target refer to the source and target of the link. Since we queried them here with a "*":null wildcard, we get the names of the source and target objects, rather than the objects themselves. If the target had been a primitive value (such as a name instead of an album), then the target_value property would have held the value of that primitive. (If the target is of /type/text, then the target property is the language of the text. If the target is of /type/key, then the target property is the namespace of the key.)

The valid and operation properties of a link specify the current validity of the link and the operation (such as insertion or deletion) that was performed on it. They are used when querying the history of a link, and are explained in Section 3.7.4. For now you just need to know that if you omit these properties from your link queries, Metaweb will only return links that are currently valid.

The timestamp and creator properties are like those defined by /type/object, but they specify the creator and creation time for the link rather than for either of the two linked objects. Here is a query, for example, that asks about the creation (when and by whom) of the objects that represent The Police and their album Synchronicity, and of the link between those two objects:

Query:

{
  "id":"/en/the_police",
  "timestamp":null,
  "creator":null,
  "/music/artist/album": {
    "name":"Synchronicity",
    "timestamp":null,
    "creator":null,
    "link": {
      "timestamp":null,
      "creator":null
    }
  }
}

Result:

{
  "id":"/en/the_police",
  "timestamp":"2006-10-22T10:02:03.0012Z",
  "creator":"/user/metaweb",
  "/music/artist/album": {
    "name":"Synchronicity",
    "timestamp":"2006-12-10T12:23:59.0119Z",
    "creator":"/user/mwcl_musicbrainz",
    "link": {
      "timestamp":"2006-12-10T12:23:59.0685Z",
      "creator":"/user/mwcl_musicbrainz"
    }
  }
}
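Timestamps in this form are easy to compare in client code. As a minimal sketch, the Python below parses the two timestamps from the result above with the standard datetime module and computes how long after the album object the link was created. The helper name is hypothetical, and the trailing Z is treated as a literal character, so the resulting values carry no time zone information.

```python
from datetime import datetime

def parse_timestamp(text):
    """Parse a timestamp of the form shown in the results above."""
    # %f accepts one to six fractional digits, so ".0685" parses fine.
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")

album_created = parse_timestamp("2006-12-10T12:23:59.0119Z")
link_created = parse_timestamp("2006-12-10T12:23:59.0685Z")

# The link was created a fraction of a second after the album object.
print(link_created - album_created)
```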

Here's another link query that demonstrates the /type/link/creator property as well as /type/link/target_value. It asks: "what is the name of the country Spain in French, and what Metaweb user contributed the translation?":

Query:

{
  "id" : "/en/spain",
  "name" : {
    "link" : {
      "target" : "French",
      "target_value" : null,
      "creator" : null
    }
  }
}

Result:

{
  "id" : "/en/spain",
  "name" : {
    "link" : {
      "target" : "French",
      "target_value" : "Espagne",
      "creator" : "/user/mwcl_wikipedia_en"
    }
  }
}

The link object described in the response represents a link between a /location/country object and a /type/text value. Since /type/text is a primitive, the target_value property holds the value of the primitive – the text itself. Since the target is of /type/text, the target property represents the language of the text, and we use this fact in the query to specify that we're interested in the French version of the name. It is worth noting that we specify the language by name here, rather than by id with /lang/fr as we usually do. The reason is that the /type/link/target property has an expected type of /type/object (since it can represent any object). The default property of /type/lang is id, which is why we normally specify languages by id. But in this case, it is the default property of /type/object that matters, which is why we must specify the name of the desired language rather than its id. (We could also have written "target":{"id":"/lang/fr"}.)

3.7.2. Toplevel Links

We saw above that the link directive allows us to query the link between the object or value that matches a sub-query and the object that matches its parent query. We can't use the link directive in a toplevel query, however. If you want to write a toplevel link query you must take a different, but simple, approach. Just include this constraint in your query:

"type":"/type/link"

/type/link is not a real type and no Metaweb objects have this type. But putting the line above into a MQL query tells Metaweb that you're interested in links rather than objects. When you write link queries of this sort, the source and target properties of /type/link typically become important to either constrain or query the source and destination of the link. (But note that you must write "type":"/type/link" in order to use the source and target properties: MQL does not allow you to use fully-qualified names for link properties, so you cannot omit the type specification and just query /type/link/target and /type/link/source.)

Here is how we query all outgoing links from The Police. For each link, we ask for the name of the property that represents the link, along with either the primitive value or the name, type, and id of the object at the other end of the link:

[{
  "type" : "/type/link",
  "source" : {
    "id" : "/en/the_police"
  },
  "master_property" : null,
  "target" : {},
  "target_value" : null
}]

And here is how we ask instead for the name, type, and id of all objects that are linked to The Police:

[{
  "type" : "/type/link",
  "target" : {
    "id" : "/en/the_police"
  },
  "master_property" : null,
  "source" : {}
}]
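The two queries above differ only in which end of the link is constrained. The Python sketch below captures that symmetry in a single helper that builds either query as a Python data structure, ready to be serialized as JSON. The function name link_query and its direction argument are assumptions made for illustration.

```python
import json

def link_query(object_id, direction):
    """Build a toplevel link query for links from or to an object."""
    if direction not in ("outgoing", "incoming"):
        raise ValueError("direction must be 'outgoing' or 'incoming'")
    if direction == "outgoing":
        near, far = "source", "target"
    else:
        near, far = "target", "source"
    query = {
        "type": "/type/link",
        near: {"id": object_id},   # the end of the link we constrain
        "master_property": None,
        far: {},                   # the end of the link we ask about
    }
    if direction == "outgoing":
        # Outgoing links may end at a primitive value rather than an object.
        query["target_value"] = None
    return [query]

print(json.dumps(link_query("/en/the_police", "incoming"), indent=2))
```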

We can also query links to primitive values. The following query, for example, asks for objects that are linked to the date July 4th, 1776:

[{
  "type" : "/type/link",
  "target_value" : {
    "type" : "/type/datetime",
    "value" : "1776-07-04"
  },
  "master_property" : null,
  "source" : {}
}]

Toplevel link queries can be used in conjunction with return:count. This query asks: "how many links did the user wp_typer create to type objects as /music/artist?":

{
  "return" : "count",
  "type" : "/type/link",
  "creator" : "/user/wp_typer",
  "master_property" : "/type/object/type",
  "target" : {
    "id" : "/music/artist"
  }
}

3.7.3. Reflection

Reflection is a MQL feature that is closely related to links. It is a mechanism for querying the properties of an object, regardless of the type that defines those properties. The * wildcard described earlier in this chapter is a type-based wildcard: it queries the value of all properties defined by a type, plus the common properties defined by /type/object. Reflection is different: it is a link-based wildcard mechanism that queries the outgoing or incoming links of an object, regardless of the type associated with those links.

Reflection is done with /type/reflect. Like /type/link, /type/reflect is a pseudo-type and objects are never assigned this type. /type/reflect exists simply to serve as a holder for three special properties: /type/reflect/any_master, /type/reflect/any_reverse, and /type/reflect/any_value. These properties must always be used by their fully-qualified names. The word "any" in these property names indicates their wildcard behavior. /type/reflect/any_master matches any outgoing link to another object. /type/reflect/any_reverse matches any incoming link from another object. And /type/reflect/any_value matches any link to a primitive value such as /type/text, /type/datetime or /type/float.

The following query uses all three of these properties on The Police. The results shown are dramatically pruned to fit. Note that each of the sub-queries uses the link directive to ask for the name of the property that was matched (recall that the default property of links is master_property):

Query:

{
  "id":"/en/the_police",
  "/type/reflect/any_master":[{
    "link":null,
    "name":null
  }],
  "/type/reflect/any_reverse":[{
    "link":null,
    "name":null
  }],
  "/type/reflect/any_value":[{
    "link":null,
    "value":null
  }]
}

Result:

{
  "id" : "/en/the_police",
  "/type/reflect/any_master" : [{
    "link" : "/type/object/type",
    "name" : "Musical Artist"
  },{
    "link" : "/music/artist/genre",
    "name" : "Rock music"
  },{
    "link" : "/music/artist/origin",
    "name" : "London"
  },{
    "link" : "/common/topic/webpage",
    "name" : null
  },{
    "link" : "/music/artist/label",
    "name" : "Polydor Records"
  }],
  "/type/reflect/any_reverse" : [{
    "link" : "/music/album/artist",
    "name" : "Outlandos d'Amour"
  },{
    "link" : "/music/album/artist",
    "name" : "Reggatta de Blanc"
  },{
    "link" : "/music/track/artist",
    "name" : "Message in a Bottle"
  },{
    "link" : "/music/track/artist",
    "name" : "Can't Stand Losing You"
  }],
  "/type/reflect/any_value" : [{
    "link" : "/type/object/name",
    "value" : "The Police"
  },{
    "link" : "/common/topic/alias",
    "value" : "Police"
  },{
    "link" : "/music/artist/active_start",
    "value" : "1977-01"
  },{
    "link" : "/music/artist/active_end",
    "value" : "1986-06"
  }]
}
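Reflective results like these are often easier to read once they are grouped by property. As a small illustration, the Python sketch below counts the incoming links per master property, using the any_reverse portion of the pruned result shown above as its input.

```python
from collections import Counter

# The any_reverse portion of the (pruned) result shown above.
any_reverse = [
    {"link": "/music/album/artist", "name": "Outlandos d'Amour"},
    {"link": "/music/album/artist", "name": "Reggatta de Blanc"},
    {"link": "/music/track/artist", "name": "Message in a Bottle"},
    {"link": "/music/track/artist", "name": "Can't Stand Losing You"},
]

# Count how many incoming links use each master property.
links_per_property = Counter(item["link"] for item in any_reverse)
for prop, count in links_per_property.items():
    print(prop, count)
```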

Note that /type/reflect/any_value does not actually return every value of an object. The id, key, timestamp and creator properties are not matched by /type/reflect/any_value. Since every object has values for these properties, however, it is easy to write queries that ask for them explicitly. Furthermore, /type/reflect/any_value never matches any property whose value is a /type/id or a /type/key. In particular, this means that an any_value query on a namespace object will not match the /type/namespace/keys properties that identify the names in the namespace.

Another difficulty with /type/reflect/any_value is that it is tricky to ask for the lang property of text values. The expected type of any_value is /type/value which does not have a lang property. This means that you can't use the unqualified lang property. But MQL will not allow you to use the full property name /type/text/lang in a reflective query. Also, a wildcard query "*":null in any_value expands only to the type and value properties. If you're interested in reflecting on text values only, you can just do this:

{
  "id":"/en/the_police",
  "/type/reflect/any_value": [{
    "type":"/type/text",
    "link":null,
    "value":null,
    "lang":null
  }]
}

But if you want to query the language id of text values without restricting your reflective query to return only text values, you must do something like this:

{
  "id" : "/en/the_police",
  "/type/reflect/any_value" : [{
    "*" : null,
    "link" : {
      "master_property" : null,
      "target" : {
        "id" : null,
        "optional" : true
      }
    }
  }]
}

We'll end this discussion of reflection with a more advanced query. It asks for the id and type of objects that have outgoing links to objects named The Police and Sting. Two results are shown, including the important /music/group_membership object that specifies that Sting was a member of The Police:

Query:

[{
  "id":null,
  "type":[],
  "first:/type/reflect/any_master": {
    "name":"Sting",
    "link":null
  },
  "second:/type/reflect/any_master": {
    "name":"The Police",
    "link":null
  }
}]

Result:

[{
  "id":"/guid/9202a8c04000641f8000000003924426",
  "type":["/music/group_membership"],
  "first:/type/reflect/any_master": {
    "link":"/music/group_membership/member",
    "name":"Sting"
  },
  "second:/type/reflect/any_master": {
    "link":"/music/group_membership/group",
    "name":"The Police"
  }
},{
  "id":"/user/saraw524",
  "type": [
    "/type/user",
    "/type/namespace",
    "/freebase/user_profile"
  ],
  "first:/type/reflect/any_master": {
    "link":"/freebase/user_profile/favorite_music_artists",
    "name":"Sting"
  },
  "second:/type/reflect/any_master": {
    "link":"/freebase/user_profile/favorite_music_artists",
    "name":"The Police"
  }
}]

A similar query could be written to find objects that link to both Arnold Schwarzenegger and Maria Shriver: it would find objects representing their marriage and their children.
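Since the query above depends only on the two names, it can be generated for any pair. Here is a minimal Python sketch of such a generator; the function name common_linkers_query is hypothetical, and the structure it returns simply mirrors the query shown above.

```python
import json

def common_linkers_query(first_name, second_name):
    """Build a query for objects with outgoing links to both named objects."""
    return [{
        "id": None,
        "type": [],
        # The "first:" and "second:" prefixes keep the two keys distinct.
        "first:/type/reflect/any_master": {"name": first_name, "link": None},
        "second:/type/reflect/any_master": {"name": second_name, "link": None},
    }]

query = common_linkers_query("Arnold Schwarzenegger", "Maria Shriver")
print(json.dumps(query, indent=2))
```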

3.7.4. History

Objects in a Metaweb database live forever: once created they can never be deleted. The closest we can come to deleting an object is to remove all links from and to it. Somewhat more surprising is the fact that links live on forever in Metaweb, too. A link may be deleted or replaced with a new value, but the historical existence of that link is retained. We've already seen that we can query the creation timestamp of any object or any link. But there is another kind of history query we can express with MQL as well. The valid property of a link specifies whether the link is currently valid or not. The following query, for example, finds the most recently invalidated link between an object and a name. The target_value property returns the old name of the object, and source.name returns the name that has replaced it.

[{
  "type" : "/type/link",
  "valid" : false,
  "master_property" : "/type/object/name",
  "source" : {},
  "target_value" : null,
  "limit" : 1,
  "timestamp" : null,
  "sort" : "-timestamp"
}]

It is important to understand that MQL link queries normally only return valid links. You can explicitly request links that are no longer valid with "valid":false, and you can request links that are either valid or invalid with "valid":null or by using a wildcard like "link":{"*":null}. All other link queries (with one exception involving the operation property that we'll learn about below) are made with an implicit "valid":true. As an example, consider the following query for the name of the /finance/currency object and the date that it was given its name:

Query:

{
  "id" : "/finance/currency",
  "name" : {
    "value" : null,
    "link" : {
      "timestamp" : null
    }
  }
}

Result:

{
  "id" : "/finance/currency",
  "name" : {
    "value" : "Currency",
    "link" : {
      "timestamp" : "2007-03-25T00:33:28.0000Z"
    }
  }
}

The name of the object is "Currency" and it has been since March 25th, 2007. Now let's alter the query slightly to ask for links of any validity:

Query:

{
  "id" : "/finance/currency",
  "name" : [{
    "value" : null,
    "link" : {
      "valid" : null,
      "timestamp" : null
    }
  }]
}

Result:

{
  "id" : "/finance/currency",
  "name" : [{
    "value" : "currency",
    "link" : {
      "valid" : false,
      "timestamp" : "2006-10-22T07:34:51.0008Z"
    }
  },{
    "value" : "Currency",
    "link" : {
      "valid" : true,
      "timestamp" : "2007-03-25T00:33:28.0000Z"
    }
  }]
}

This query tells us that on October 22nd, 2006 the /finance/currency object was given the name "currency" (with a lowercase c), and that the name was changed to "Currency" on March 25th of the following year. Note that the name query is now written using square brackets. When querying link history, we must use square brackets even with unique properties because the property may have had more than one value over time.
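
A response like this one is easy to turn into a timeline on the client side. The Python sketch below sorts the name values from the result shown above by link timestamp and picks out the one that is currently valid. The function names are illustrative only, and the data is copied from the result above.

```python
import json

# The result shown above, as JSON text.
RESULT = """
{
  "id": "/finance/currency",
  "name": [
    {"value": "currency",
     "link": {"valid": false, "timestamp": "2006-10-22T07:34:51.0008Z"}},
    {"value": "Currency",
     "link": {"valid": true, "timestamp": "2007-03-25T00:33:28.0000Z"}}
  ]
}
"""

def name_history(result):
    """Return (timestamp, value, valid) tuples, oldest first."""
    entries = [(n["link"]["timestamp"], n["value"], n["link"]["valid"])
               for n in result["name"]]
    # Timestamps in this fixed-width format sort correctly as plain strings.
    return sorted(entries)

def current_name(result):
    """Return the value whose link is still valid, or None if there is none."""
    for n in result["name"]:
        if n["link"]["valid"] is True:
            return n["value"]
    return None

result = json.loads(RESULT)
for timestamp, value, valid in name_history(result):
    print(timestamp, value, "valid" if valid else "invalid")
print("current:", current_name(result))
```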

Metaweb makes a record of every insertion, deletion and update of a link, and the operation property of a link allows us to use this in queries. The possible values of this property are "insert", "delete" and "update". Links that correspond to unique properties (including the name property) can be inserted, deleted, or updated. Links that correspond to non-unique properties, however, are never updated: they can only be inserted and deleted.

We use the operation property in the following query to find types that have been deleted (that is, objects that used to be linked to /type/type but no longer are) and also to ask when they were deleted and by whom. (Note that the creator property of a link can also refer to the user who deleted or updated the link.)

[{
  "type" : "/type/link",
  "operation" : "delete",
  "master_property" : "/type/object/type",
  "source" : {},
  "target" : { "id" : "/type/type" },
  "timestamp" : null,
  "creator" : null
}]

Note that this query explicitly asks for links that have been deleted. This means that it returns invalid links even though it does not include "valid":null or "valid":false. If you merely query the operation property with "operation":null, you will not get invalid links unless you also query or constrain the valid property. If you query the operation property without using the valid property, the results you get will only include insertions and updates, not deletions, because a link that has been deleted is, by definition, no longer valid. The valid property of a deleted link is actually null, not false, so writing a query for links that are invalid will only return links that have been updated, not those that have been deleted. To find deleted links, you must explicitly use "operation":"delete".
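
Because deletions must be requested explicitly, it can be convenient to generate deleted-link queries from code. This Python sketch generalizes the query above to any master property and target. The helper deleted_links_query is hypothetical, and nothing here contacts a Metaweb server. Serializing the query with json.dumps has the side benefit that the query text is always well-formed JSON.

```python
import json

def deleted_links_query(master_property, target_id):
    """Build a /type/link query for deleted links of one property to one target."""
    return [{
        "type": "/type/link",
        # Deleted links are only returned when asked for explicitly.
        "operation": "delete",
        "master_property": master_property,
        "source": {},
        "target": {"id": target_id},
        "timestamp": None,  # when the link was deleted
        "creator": None,    # the user who deleted it
    }]

query = deleted_links_query("/type/object/type", "/type/type")
print(json.dumps(query, indent=2))
```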

Here's a complex query that asks for type objects that have had their English names changed. It also asks when those changes were made, and by whom. It tells us, for example, that the type /location/province was originally named "Province", but that /user/colin updated the name to "Canadian Province" in January 2007. Then /user/jeff updated the name to "CA Province" in August 2007 and then updated it back to "Canadian Province" in November 2007. Rather than using a top-level /type/link query, this query uses two different link directives to find both the original insertion of the name and also all subsequent updates to the name.

[{
  "type":"/type/type",
  "id":null,
  "original:name":[{
    "value":null,
    "link": {
       "operation":"insert",
       "valid":false,
       "timestamp":null,
       "creator":null
    }
  }],
  "new:name": [{
    "value":null,
    "link": {
       "operation":"update",
       "valid":null,
       "timestamp":null,
       "creator":null
    }
  }]
}]

Note again that the name queries are surrounded by square brackets because historical queries must be expected to return multiple results. This query does not ask about name deletions: it assumes that names are changed by updates rather than deletions and re-insertions. We can write a simpler and more general query to ask about the complete name history of type objects:

[{
  "type" : "/type/type",
  "id" : null,
  "name" : [{
    "value" : null,
    "sort" : "link.timestamp",
    "link" : {
      "valid" : null,
      "operation" : null,
      "creator" : null,
      "timestamp" : null
    }
  }]
}]

The problem with this query is that it matches any type with a name, even types that have never had the name changed. We can fix this by requiring that the type object have at least one currently invalid name link:

[{
  "type" : "/type/type",
  "id" : null,
  "number_of_invalid:name" : [{
    "return" : "count",
    "link" : { "valid" : false  }
  }],
  "name" : [{
    "value" : null,
    "sort" : "link.timestamp",
    "link" : {
      "valid" : null,
      "operation" : null,
      "creator" : null,
      "timestamp" : null
    }
  }]
}]
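
The relationship between the last two queries can be made explicit in code: the second is simply the first plus one extra constraint. The Python sketch below, in which the names NAME_HISTORY and require_invalid_name are hypothetical, derives the restricted query from the general one without modifying the original.

```python
import copy
import json

# The general name-history query for type objects, as shown above.
NAME_HISTORY = [{
    "type": "/type/type",
    "id": None,
    "name": [{
        "value": None,
        "sort": "link.timestamp",
        "link": {"valid": None, "operation": None,
                 "creator": None, "timestamp": None},
    }],
}]

def require_invalid_name(query):
    """Return a copy of the query that also demands an invalid name link."""
    restricted = copy.deepcopy(query)
    restricted[0]["number_of_invalid:name"] = [{
        "return": "count",
        "link": {"valid": False},
    }]
    return restricted

print(json.dumps(require_invalid_name(NAME_HISTORY), indent=2))
```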

[7] MQL is pronounced "mickle". It rhymes with "pickle", not "sequel".

[8] You should read this section even if you already know JavaScript. JSON is only a subset of JavaScript, and its syntax is stricter than JavaScript syntax.

[9] The JSON syntax diagrams that appear below are also from the JSON website, where they have been placed in the public domain.

[10] JSON itself supports 32-bit, 16-bit and 8-bit encodings of Unicode text. Metaweb, however, requires the 8-bit UTF-8 encoding.

[11] Objects may actually have more than one name, but may only have one name in any given language. For this reason, simple name queries only return one value. We'll see more about this later in the chapter.

[12] September, 2008

[13] Metaweb's ordered collections are sometimes described as lists, but this term is inaccurate because lists are allowed to have duplicate elements. Metaweb's ordered collections are still fundamentally sets, and duplicates are not allowed.

[14] If you've done programming with languages like Perl or Ruby, this syntax should look familiar. If you've worked with SQL queries for relational databases, ~= is like the SQL LIKE operator. Otherwise, think of "~=" as meaning "approximately equal" or "like".