no.16 - Deep Dive into Tana's Data Structure for AI.
Why Tana is Super Powerful to Leverage Artificial Intelligence.
I have a Master's degree in Artificial Intelligence.
Trust me when I say this:
- Tana was built with AI in mind.
This is NOT just another app that decided to implement some nice features using ChatGPT. It's much more than that.
Tana works much like a programming environment.
As if Notion, Roam Research, and VS Code had a baby who speaks Python.
The intersection between note-taking and computer software.
You can reliably store knowledge and also execute commands using your knowledge.
This is possible because Tana's core features come from Computer Science and Software Engineering.
That's where we'll start…
Before we discuss why Tana is powerful for AI,
let's first understand how the core features in Tana are related to fundamental components of Information Science.
Nodes, Graphs, Classes, OOP, Knowledge Bases.
💬 4 Quotes
Quote 1
“In Tana everything is a Node.” – Lukas Kawerau (Cortex Futura)
Nodes and Edges are core pieces of Graphs 🕸
A Graph (G) is a mathematical structure used in Information Science that models Networks & Relationships: G = (V, E).
- V = Vertices (nodes)
- E = Edges (links)
Nodes are points (intersections)
Edges are links (relationships)
Tana's data structure is a huge Graph.
Everything is stored as Nodes and connected through Edges.
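To make G = (V, E) concrete, here is a minimal Python sketch. The node names and the `neighbors` helper are made up for illustration; this is not how Tana stores data internally.

```python
# A tiny graph G = (V, E): a set of vertices and a set of edges.
V = {"Tana", "Supertag", "Field"}
E = {("Tana", "Supertag"), ("Supertag", "Field")}


def neighbors(node: str) -> set[str]:
    """Return every node linked to `node`, ignoring edge direction."""
    outgoing = {b for a, b in E if a == node}
    incoming = {a for a, b in E if b == node}
    return outgoing | incoming


print(sorted(neighbors("Supertag")))  # ['Field', 'Tana']
```

Nodes are just names here, and each edge is a pair of nodes.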
Quote 2
“A Knowledge Base (KB) is a collection of structured data about entities and relations.” – Gerhard Weikum, author of Machine Knowledge
Tana is not just a Graph.
Tana is a Knowledge Graph.
A Knowledge Graph is the combination of a Graph with a Knowledge Base (KB).
And as this quote explains, a Knowledge Base has structured data.
More specifically, it represents entities and their semantic types.
By definition, an Entity is a "thing" that exists.
Pretty simple, right?
This is where semantic types come into play, to categorize things into more specific types of entities.
Tana assigns semantic types using Supertags.
Supertags represent different types of entities.
Such as:
- Tasks
- Projects
- People
- Companies
Additionally, a Knowledge Base also represents attributes and relationships.
- Attributes (fields) hold data about entities.
- Relationships model how entities (supertags) relate to one another.
Tana is actually a Huge Knowledge Graph.
In a nutshell: (summary ↓)
- Supertags represent semantic types.
- Supertags have fields to hold data.
- Supertags have specific relationships with other Supertags.
When you apply a Supertag to a Node:
The Node becomes an entity that belongs to a semantic type, with fields to hold data and relationships to other entities.
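Here is a rough sketch of that idea in Python. The `Supertag` and `Node` classes, the `apply_supertag` function, and the example names are all assumptions made for illustration, not Tana's actual data model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Supertag:
    """A semantic type, with the names of the fields it provides."""
    name: str
    field_names: list[str]


@dataclass
class Node:
    """A plain node; it becomes an entity once a supertag is applied."""
    text: str
    supertag: Optional[Supertag] = None
    fields: dict[str, object] = field(default_factory=dict)


def apply_supertag(node: Node, tag: Supertag) -> Node:
    """Give the node a semantic type and empty fields to hold data."""
    node.supertag = tag
    for name in tag.field_names:
        node.fields.setdefault(name, None)
    return node


person = Supertag("person", ["Company"])
company = Supertag("company", ["Industry"])

acme = apply_supertag(Node("Acme"), company)
ada = apply_supertag(Node("Ada"), person)
ada.fields["Company"] = acme  # a relationship: the field points to another entity

print(ada.supertag.name, ada.fields["Company"].text)  # person Acme
```

The relationship is nothing more than a field whose value is another entity.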
Quote 3
“Supertags are equivalent to classes in Object-Oriented Programming (OOP). Tana introduced Object-Oriented Note-taking.” – Fis Fraga
First of all, let's normalize quoting ourselves 😝.
Second, Classes and OOP make a great analogy for understanding Tana.
But before that, let me quickly define OOP.
OOP, or Object-Oriented Programming, is a programming paradigm where software is designed using objects.
An Object is a data structure that has:
- Data - Information
- Code - Actions
(See example below)
But objects get their Data and Code from their Class.
Every object belongs to a class.
Classes define the data and the code that will be available to objects.
And Supertags…
Supertags work exactly like Classes.
Supertags define the fields and commands that will be available to nodes.
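For readers who have never written a class, a minimal Python sketch of the Class/Object relationship could look like this. The `Task` class and its fields are hypothetical, chosen only to mirror a typical supertag.

```python
class Task:
    """A class: defines the data and the code every Task object gets."""

    def __init__(self, title: str) -> None:
        self.title = title  # data
        self.done = False   # data

    def complete(self) -> None:
        """Code: an action that changes the object's own data."""
        self.done = True


# Two objects created from the same class.
first = Task("Write newsletter")
second = Task("Review draft")

first.complete()
print(first.done, second.done)  # True False
```

Both objects get the same fields and the same action from the class, but each one keeps its own data.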
Quote 4
“Supertags are the beating heart of the Tana Graph and were made with AI in mind from the start. Tana AI understands your data, graph and applications so it can do more than just writing and summarizing text.” – Olav Kriken, co-founder of Tana
Just in case you're wondering if I made up all of this mumbo-jumbo about Information Science, Knowledge Bases, and Object-Oriented Programming…
Here is one of Tana's co-founders backing me up!
Tana was made with AI in mind from the start.
It's NOT a coincidence that Supertags fit perfectly when compared to BOTH:
- Entities and Semantic Types in KBs.
- Objects and Classes in OOP.
Tana AI understands your data because Supertags are based on tested principles that come from Information Science.
📄 3 Notes
1. Deep Dive into Supertags = Classes
This note builds upon the idea of Objects having data and code.
This is very important to understand how Tana works.
But first, let's understand how Objects and Classes are related.
There are 2 definitions for a Class:
- 🗺 Blueprint of an object
- 🏢 Factory that creates objects
And of course, these 2 are related.
The class is both the designer and the creator of objects.
Now, coming back to Tana…
Supertags also play these 2 roles inside Tana.
A Supertag holds the design for objects and creates an object when added to a node.
The image below shows the Supertag definition, the equivalent Class definition, and the object created using this Blueprint.
A Supertag has the exact same structure as a Class and is used in the same way, to create objects.
But things get better!!! 😱
We covered the data, but what about the code?
That's where the magic happens.
Tana recently launched commands.
Commands act exactly as the Code does in a Class.
You have the option to configure a command that can be executed with a button (or using the command line).
The example below shows the command defined inside the Supertag (in the Advanced section), the equivalent function defined in the Class, and finally, the button appearing in the object, to run the command.
When you click the button, the “Read” field is set to ✅.
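In Python, the class side of this comparison might be sketched as follows. The `Book` class, its field names, and the ⬜ placeholder for "not read yet" are assumptions for illustration; only the idea of a command that sets "Read" to ✅ comes from the example above.

```python
from dataclasses import dataclass


@dataclass
class Book:
    """Plays the role of a hypothetical #book supertag."""
    title: str
    read: str = "⬜"  # stands in for the "Read" field

    def mark_as_read(self) -> None:
        """Stands in for the command that runs behind the button."""
        self.read = "✅"


book = Book(title="Machine Knowledge")
book.mark_as_read()  # like clicking the button on the node
print(book.read)  # ✅
```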
Of course, this is a very simple example.
You can create commands that do all sorts of complex manipulations in Tana.
(Including AI commands using GPT.)
Tana works like software.
2. Tana is a Programming Environment
Tana works like software.
If you can hold data and run commands, we are clearly talking about a programming environment.
Tana works just like a classic computer program, which has different objects, each with its own data, that interact with other objects to fetch and send data across your workspace.
Tana provides a massively powerful algorithmic structure to create workflows and use AI to create apps.
The Tana team is responsible for building the data structure and interface.
Users are responsible for building apps that solve specific problems.
Tana becomes a programming environment and power-users are programmers that create and share apps with others.
3. LLMs in Tana - Perfect Combo
Tana's structure is perfect for using LLMs.
Most LLMs (Large Language Models) are built on the Transformer architecture.
A great example is GPT: T stands for Transformer.
- Generative Pre-trained Transformer
A Transformer is an architecture that revolutionized Natural Language Processing. See the original Paper for details.
Let me briefly explain what a Transformer is, and why Tana is the perfect place to use Transformers.
What is a Transformer
Transformer models are pre-trained on large amounts of raw text and develop a statistical understanding of the language.
But the most important part of a Transformer is transfer learning.
Transfer learning is when you fine-tune the model using specific data. This transfers the (general) statistical knowledge from the entire language into a specific use case.
So that the Transformer solves a specific problem.
HOW Transformers solve problems
Here's where things get interesting.
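The original Transformer uses an Encoder-Decoder architecture. (GPT-style models keep only the Decoder, but the input-to-output flow is the same idea.)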
- Input is Encoded.
- Transformer manipulates data.
- Data is Decoded into the Output.
Encoder: Receives input and builds a representation of its features. Acquires understanding from the input.
Decoder: Uses encoded representation (features) along with other inputs to generate outputs.
Why is this good for Tana?
Well, this is where everything in this letter comes together.
- Object (type A) has fields, which hold data.
- Fields can point to Objects of other Supertags (type B).
- Objects can have AI Commands using LLMs.
- LLMs are Transformers = Encoder-Decoder.
- Encoder receives input data.
- Decoder generates output data.
What happens when you chain everything?
Objects can have AI commands:
- Field A holds input data.
- Input data goes into the LLM.
- LLM generates output data.
- Data is saved to a new Field B.
NOW…
The new output data in the new field can be used as input to OTHER commands.
Tana's fields (data) are perfect to create a chain of commands (code), that grab information from one field, and use it to generate data inside other fields.
And so on…
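The chain above can be sketched in a few lines of Python. Everything here is hypothetical: `fake_llm` is a placeholder that only upper-cases text (a real setup would call an actual model), and `ai_command` and the field names are invented for the example.

```python
from typing import Callable


def fake_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; it only upper-cases and tags the text."""
    return f"[LLM] {prompt.upper()}"


def ai_command(source: str, target: str, instruction: str) -> Callable[[dict], None]:
    """Build a command that reads one field and writes the result to another."""
    def run(fields: dict) -> None:
        fields[target] = fake_llm(f"{instruction}: {fields[source]}")
    return run


# An object with a single input field (Field A).
note = {"Transcript": "tana stores everything as nodes"}

summarize = ai_command("Transcript", "Summary", "summarize")
make_title = ai_command("Summary", "Title", "title")  # uses the previous output

for command in (summarize, make_title):
    command(note)

print(note["Summary"])
print(note["Title"])
```

The second command never sees the transcript. It only reads the field the first command wrote, and that is the whole trick behind chaining.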
Great things are coming with Tana's architecture!
P.S. Bonus: If you want a geeky expanded version of the chain of commands:
The LLM receives data from multiple fields in your Object (type A), and also from fields in the related Object (type B). It encodes everything, then decodes it to generate an output that can be saved as a new field in the Object (type A). That new field can then kick off further cycles.
🔗 Further Reading
Link 1: X (Twitter) Thread: Software Engineering principles in Tana.
Check it out to go deeper into this analogy. It's more of a step-by-step explanation.
Link 2: Research Paper
Read my latest paper: Creating Automatic Connections for Personal Knowledge Management. Published with Springer Nature Computer Science.
Thank you for reading!
Fis Fraga, M.Sc. is a Tana Ambassador and digital writer. He helps people develop a productive and fulfilling life using a mix of Knowledge Management and Artificial Intelligence.
You can read more at: