What does a typical Note-crafter actually do? A Typology of Note-taking Activities
A brief but important clarification on the origins of this report.
My name is Rustam Agamaliev. I am a teacher, researcher, and the author of several books and numerous publications. Currently, my work lies at the intersection of pedagogy, cognitive science, linguistics, and psychology; I primarily work with children and university students.
For the last 7 years, I have been cultivating a community dedicated to Knowledge Management, Note-craft, and information processing. It is this specific part of my activity that I want to discuss in the report below.
The goal of this work was to identify the pattern-based actions of a person who records their thoughts and engages in self-directed learning. I performed the analysis of the language corpus using Python and Jupyter/Marimo notebooks—something I was doing for the very first time. The result is presented below.
Professionals in data analytics may spot flaws in my methodology and offer alternative approaches; I would be delighted to read your feedback.
For about a month now, I have been hatching a research plan to evaluate the effects of Note-craft (Note-taking craft) on a person’s professional and personal life. And every time I approached this beast [1], people whose opinions matter would ask me the same question: “So, what exactly is Note-craft? What do you mean by taking notes? What process are you referring to?”
Every attempt to explain it crashed against a wall of misunderstanding. My conversation partners expressed total confusion that I sensed in their questions and comments. Ultimately, they would admit they just didn’t get it, forcing me to go back to the drawing board (Canvas). In an attempt to figure out what Note-craft actually is, I studied the (many) notes in my own vault, drew multiple conceptual maps, polled our community members, and spoke with professionals from various fields.
Whenever I asked someone how they handled their personal and professional records, they would invariably:
- Describe the technical features of their apps and services;
- Start to drone on [2] about details;
- Or say they capture things in an app and “process it later”—without ever specifying what “processing” means or when that “later” actually arrives.
All in all, it seemed there was an obvious problem with the operationalization of the concept of “Note-craft,” much like with “Task-churning” [3], “Creativity,” or any other “complex concept” requiring interpretation. Nevertheless, I didn’t stop searching for a definition of Note-craft, though I failed to find a single formulation that encompassed the full spectrum of the activity.
At some point, a friend of mine, another researcher, gave me a hint. He suggested operationalizing the concept of “Note-craft” not by asking for a definition, but by asking about behaviors: What do people do when an idea hits them? How do they describe the way they “later” process their notes? What actually happens during the process of “thoughtful” (deep) learning?
So, how do we do this?
Part 2: The “Fly on the Wall” Method
I tried messaging everyone privately to ask: “What do you actually do when you write notes?”
To be honest, that turned out to be a mediocre idea. The result was the same old gibberish that didn’t help me understand the concept of “Note-craft” at all—only now I had to spend minutes (sometimes days) going back and forth in chat threads. In short, I gave up on the idea of personal interviews and got a bit discouraged… until I was introduced to the approach known as “Grounded Theory.”
Turns out, that’s actually a thing.
According to this theory, when searching for the answer to “What is Note-craft?” or any other abstract, “complex” question, I shouldn’t start with a hypothesis; I should start with a clean slate. I must enter a state where I know nothing about how to work with notes. I have no assumptions about the process. I must forget that I am familiar with the works of Sönke Ahrens and our private conversation with him, and I must pretend I don’t know who Niklas Luhmann is. This approach, at least in theory, guarantees that instead of relying on what I “know,” I collect evidence and study what the data “tells” me, building my hypotheses and opinions solely on what I have seen and read.
But a question remained: How do I “observe” another person’s Note-craft?
In a school or university setting, this wouldn’t be a problem: I would give the students a task, step aside, watch closely, record my thoughts on what I saw, and identify pattern-based actions. But how do I do that with a “remote” audience?
At that exact moment, a copy of Harry Potter and the Methods of Rationality landed on my desk—my child had dragged it in after “cleaning” their room. “That’s it!” I thought.
In Eliezer Yudkowsky’s book, there is a character—a reporter—who could turn into a fly (or some other small bug) and observe all the events she was interested in while in that form. I decided to use this image: I would become the fly.
I created a questionnaire with three specific situational questions:
1. Imagine that I am a ghost (or an unobtrusive fly) following you for 24 hours. You are away from your desk—perhaps walking somewhere, sitting in a café, or talking to a friend—and a brilliant idea hits you (or so it seems at the time). What strictly physical actions of capturing this idea will I, as the ghost standing over your shoulder, see?
2. Let’s flash forward in time. You have returned to your computer or the place where you usually think about your tasks, ideas, and projects. What happens to that “record” you made earlier when I was following you as a ghost?
3. When you are purposefully studying something complex (like a book or a course), is there one specific habit that makes you feel “productive”?
In designing these questions, I tried to create a context where the respondent would be forced to describe how they act in different conditions (quickly on the go, at the computer, and when studying) rather than just telling me that they take notes. I managed to collect 185 responses. Now, I had to “work” something out of them.
The key problem I needed to solve was how to wade through the typos, conversational slang, and neologisms to arrive at some conclusions. A massive task lay ahead—one unfamiliar to me—of decoding these linguistic constructions. And that is what the next part of this story is about.
The Research Architecture
When processing the data, I didn’t rely on guesswork, nor did I ask AI to do the analysis for me. Instead, I used a mixed methods approach, evaluating both quantitative and qualitative parameters.
Let’s start with how I extracted the data. I repeat: I did not feed the answers into an AI and ask it to “analyze this.” In that scenario, the results would be the product of a “black box,” and that is not what we need. Instead, I created a 4-stage filter.
1. Normalization
Natural language is “dirty.” A computer doesn’t understand that “vn,” “voice msg,” and “audio” are all the same entity. Therefore, using a Python script, we created a map of regular expressions to unify the data. Here is an example of the transformation:
- Input Data: “Dropping a vn in tg”
- Transformation: s/tg/telegram/, s/vn/voice_message/
- Clean Data: “Dropping a voice_message in telegram”
Why do we need this? Without this kind of “cleaning,” counting verbs and action objects (remember, we are trying to understand what Note-craft is and what actions it is associated with) would be impossible. We would simply dilute the data with conversational slang and learn nothing.
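To make the normalization step more concrete, here is a minimal sketch of what such a map of regular expressions might look like. It is not the original script: the names `NORMALIZATION_MAP` and `normalize` are hypothetical, the patterns cover only the English examples mentioned above, and matching case-insensitively is my own assumption.

```python
import re

# Hypothetical map: regex pattern -> unified token.
# Word boundaries (\b) keep "tg" from matching inside longer words.
NORMALIZATION_MAP = {
    r"\btg\b": "telegram",
    r"\b(?:vn|voice msg|audio)\b": "voice_message",
}

def normalize(text):
    """Apply every pattern in the map to the text, one after another."""
    result = text
    for pattern, replacement in NORMALIZATION_MAP.items():
        result = re.sub(pattern, replacement, result, flags=re.IGNORECASE)
    return result

print(normalize("Dropping a vn in tg"))
# Dropping a voice_message in telegram
```

Keeping the rules in a plain dictionary means a new slang variant can be added as one more entry without touching the function.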
2. Tokenization (Atomization)
Next, we atomized ALL 185 responses down to individual words (tokens). We compiled a list of “stop-words” consisting of prepositions and pronouns (though I later noticed I didn’t manage to “stop” all the pronouns), and we isolated the verbs and nouns. As a result, we ended up with 2000+ unique words (tokens).
For this atomization, we used the morphological library pymorphy2—a Russian-speaking “robot” (because my respondents were answering in Russian) that automatically identifies the roots of all words and reduces them to their base form. We “manually” ran words through this robot. If it saw, for example, the root “-writ-” (to write), it automatically classified words like “wrote,” “writing,” and “record” into the single category “WRITE.”
There was probably a more elegant way to do this, but hey—I don’t know how.
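For readers who want to try something similar, the atomization step could be sketched roughly as follows. This is an illustration under assumptions, not the code from the study: the stop-word list and the toy lemma table are invented, and the lemmatizer is passed in as a plain function so the sketch runs without any morphological library.

```python
import re
from collections import Counter

# Hypothetical stop-word list (prepositions, pronouns, articles).
STOP_WORDS = {"i", "a", "the", "in", "on", "it", "my", "to", "and"}

# Toy stand-in for a morphological analyzer. With pymorphy2 installed,
# one could instead pass: lambda w: morph.parse(w)[0].normal_form,
# where morph = pymorphy2.MorphAnalyzer().
TOY_LEMMAS = {"wrote": "write", "writing": "write", "notes": "note"}

def toy_lemmatize(word):
    return TOY_LEMMAS.get(word, word)

def tokenize(text, lemmatize=toy_lemmatize):
    """Lowercase, split into words, drop stop-words, reduce to base forms."""
    words = re.findall(r"[^\W\d]+", text.lower())
    return [lemmatize(word) for word in words if word not in STOP_WORDS]

responses = ["I wrote it in my notes", "Writing notes on the go"]
counts = Counter(token for response in responses for token in tokenize(response))
print(counts.most_common(2))
# [('write', 2), ('note', 2)]
```

The frequency table produced at the end is the kind of input a word cloud is built from.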
Concept Clouds and Clustering

[Image: what is note-taking word cloud]
The word clouds are in Russian because, again, my respondents answered in Russian :)
This is the moment where we turned to Grounded Theory: we didn’t categorize anything ourselves. We simply observed which words appeared most frequently and in what context. We analyzed the ratio of verbs to subjects that, in our worldview and this specific survey, correlated with a tool or an action.
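Looking at a frequent word together with its neighbors can be sketched as a small keyword-in-context helper. This is only an illustration of the idea; the function name `keyword_in_context`, the window size, and the sample tokens are assumptions and do not come from the study.

```python
def keyword_in_context(tokens, keyword, window=2):
    """Return (left, keyword, right) snippets for every occurrence of keyword."""
    snippets = []
    for index, token in enumerate(tokens):
        if token == keyword:
            left = tokens[max(0, index - window):index]
            right = tokens[index + 1:index + 1 + window]
            snippets.append((" ".join(left), token, " ".join(right)))
    return snippets

# Invented sample of already-normalized tokens.
tokens = "send voice_message to telegram then copy note to obsidian".split()
for left, word, right in keyword_in_context(tokens, "copy"):
    print(f"{left} [{word}] {right}")
# telegram then [copy] note to
```

Reading such snippets side by side makes it easier to judge whether a word names a tool or an action in a given answer.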
We “manually” examined the context for the top ten most popular concepts and created a “dictionary” [2] to map synonyms. But that wasn’t enough.
We still needed a way to understand which words belonged to which behavior. Thus, the idea of Note-crafter “Archetypes” was born. For example, the cluster of words [delete, trash, and clean] corresponded to the Archetype “The Cleaner.”
In total, we identified 15 Archetypes [3]. Each type is characterized by specific behavior, distinct actions, keywords, and a “Base Note-crafting Action” (which we will use in the next study, so I won’t reveal it just yet [4]):
| Stage | Archetype | Typical Behavior | Keywords (Markers) |
|---|---|---|---|
| CAPTURE | The Messenger (Self-Chat) | Uses a messenger as a quick inbox. Forwards messages to themselves. | telegram, tg, whatsapp, saved messages, dm, forward, chat |
| CAPTURE | The Voicer (Voice) | Prefers talking over writing. Often uses AI for transcription. | voice msg, vn, dictaphone, dictate, siri, audio, whisper |
| CAPTURE | The Analogist (Analog) | Trusts only physical media. Ignores digital methods while on the go. | notebook, paper, napkin, pen, moleskine, sticker, sheet, hand-write |
| CAPTURE | The Architect (App Direct) | Opens the destination app immediately. Hates temporary folders. | obsidian, notion, evernote, keep, notes app, apple notes, google, bear |
| CAPTURE | The Optimist (Memory) | Relies on memory. Believes that “if it’s important, I won’t forget it.” | remember, nothing, hope, memory, later, in my head, recall |
| PROCESS | The Cleaner (Cleaner) | Goal: Inbox Zero. Deletes the noise, keeps only the 10% essence. | delete, sort out, trash, clean, remove, archive, unnecessary, wipe |
| PROCESS | The Mover (Mover) | Data logistics. Simply moves the note from Inbox to Archive without changes. | move, copy, inbox, transfer, drag, drop, file away |
| PROCESS | The Librarian (Tagger) | Focuses on retrieval. Adds metadata, tags, and folders for the future. | tags, link, folder, category, sort, structure, catalog, connect |
| PROCESS | The Refiner (Refiner) | Creative processing. Rewrites the rough draft into a final copy. | finish writing, edit, rewrite, polish, in my own words, essence, rephrase |
| PROCESS | The Skipper (Hoarder) | Skips the processing stage. Notes pile up as “dead weight.” | nothing, remains, rarely, whatever, stay, leave, forget, pile up, ignore |
| STUDY | The Cartographer (Mapper) | Visual learner. Understands info only when drawing connections between ideas. | map, scheme, draw, mind map, canvas, graph, arrows, visual, miro |
| STUDY | The Collector (Highlighter) | The Gatherer. Highlights important parts in color and saves quotes. | highlight, marker, color, underline, quotes, copy-paste, save |
| STUDY | The Translator (Translator) | Deep processing. Rewrites the author’s text into their own language/understanding. | own words, rewrite, summary, essence, retelling, understanding, explain |
| STUDY | The Structuralist (Structuralist) | Linear thinking. Loves lists, bullets, and clear hierarchy. | structure, header, plan, table of contents, bullet, list, hierarchy |
| STUDY | The Examiner (Tester) | Active Recall. Turns notes into questions for self-testing. | question, answer, anki, flashcards, test, exam, check, recall |
However, these types are not mutually exclusive: a single person can embody several of them. Here is an example from my friend, Dima Lauhin:
| Context | Dima’s Answer | The Robot’s Verdict |
|---|---|---|
| Q1: Capture | “I take out my iPhone, hold the Action Button, dictate by voice; the result is transcribed and sent to today’s Daily Note in Obsidian…” | The Architect (Keywords: obsidian, note) |
| Q2: Process | “I process accumulated records… I create a new note based on what was recorded, or I copy it into an existing one…” | The Mover (Keyword: copy) |
| Q3: Study | “A single note with a course summary… For books, I read in Zotero… I send highlights to Obsidian using a template.” | The Translator (Keyword: summary) |
At this point, it is appropriate to make a very important clarification about the limitations of this typology.
The robot we “assembled” sees only what we taught it to see. If someone records their thoughts in an atypical way—for example, with a marker on their forearm—the robot will “miss” it and send this Note-crafter to the “Unclassified” group (of which, by the way, there were 88 people—but we sorted them out too, though I won’t bore you with the details).
The second flaw of such a typology is the robot’s blindness to context: it sees keywords, not intent. It cannot understand what was meant when Dima wrote “copy” during the processing stage. Here is what Dima said when I showed him the Archetype table:
“Well, on the whole, it’s fair. But the conclusion about ‘Processing’ is debatable based on the word ‘copy’ in my answer. That specific conclusion doesn’t match my practice.
I don’t just forward things from Inbox to Archive without changes. In my answer, ‘copy’ meant taking specific parts of the source note to use as a basis for creating a new one—not copying the whole thing.”
It is highly probable that if someone writes “I hate Obsidian” in their response, our robot will see only the word Obsidian and classify this respondent as The Architect.
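This blindness to context is easy to reproduce with a toy keyword matcher. The sketch below is not the script used in the study: `ARCHETYPE_KEYWORDS` holds only a few keywords copied from the table above, and `match_archetypes` is a hypothetical helper that checks for the presence of words and nothing else.

```python
import re

# A small excerpt of the keyword dictionaries from the archetype table.
ARCHETYPE_KEYWORDS = {
    "The Architect": {"obsidian", "notion", "evernote"},
    "The Mover": {"move", "copy", "transfer"},
    "The Cleaner": {"delete", "trash", "clean"},
}

def match_archetypes(text):
    """Return every archetype whose keywords appear in the text."""
    words = set(re.findall(r"\w+", text.lower()))
    return [name for name, keywords in ARCHETYPE_KEYWORDS.items() if words & keywords]

# The matcher sees the keyword, not the sentiment or the intent.
print(match_archetypes("I hate Obsidian"))
# ['The Architect']
print(match_archetypes("I copy specific parts into a new note"))
# ['The Mover']
```

Catching negation or intent would require more than a dictionary lookup, which is why a manual review pass remains necessary.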
I, of course, manually combed through the data. Nevertheless, in such a volume (2000+ words), I likely missed things. But that is unimportant, because the goal of this study was to find pattern-based actions, not to achieve high-precision individual profiling.
And the final risk is related to the nature of the data: we didn’t watch what people did; we asked them. In this situation, a person usually describes their “Ideal Self.” They might think they are a Structuralist [5], but if you actually analyze their activity, they are a Collector [6].
Tagging
After creating dictionaries for each archetype, we ran a script that “watched” and classified all the responses. Initially, we had a certain number of unclassified answers, but we shoehorned them in, so eventually, every respondent received a category.
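A tagging pass of this kind might look roughly like the sketch below. It rests on assumptions: the per-stage dictionaries contain only a handful of keywords from the table, the scoring rule (most keyword hits wins, ties go to the archetype listed first) is my own choice, and the sample responses are invented.

```python
from collections import Counter

# Hypothetical excerpt of the per-stage archetype dictionaries.
STAGE_DICTIONARIES = {
    "CAPTURE": {
        "The Messenger": {"telegram", "whatsapp", "chat", "forward"},
        "The Voicer": {"voice_message", "dictate", "dictaphone"},
    },
    "PROCESS": {
        "The Cleaner": {"delete", "trash", "clean"},
        "The Mover": {"move", "copy", "transfer"},
    },
}

def tag_response(stage, tokens):
    """Pick the archetype with the most keyword hits, or 'Unclassified'."""
    scores = {
        archetype: len(keywords & set(tokens))
        for archetype, keywords in STAGE_DICTIONARIES[stage].items()
    }
    best = max(scores, key=scores.get)  # ties go to the first archetype listed
    return best if scores[best] > 0 else "Unclassified"

# Invented, already-tokenized answers paired with their question stage.
responses = [
    ("CAPTURE", ["forward", "message", "telegram"]),
    ("CAPTURE", ["write", "marker", "forearm"]),
    ("PROCESS", ["delete", "trash", "old"]),
]
tags = [tag_response(stage, tokens) for stage, tokens in responses]
distribution = Counter(tags)
print(distribution)
```

Answers that land in "Unclassified" are the ones that need a manual look or a new dictionary entry.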
What’s next, and what if you don’t take my word for it?
To “prove” that everything I’ve said isn’t just a figment of my imagination, I am attaching links below, just as I did with my previous study (about the impact of smartphone use on cognitive capabilities of children). Via the links, you can find all the source code, the Python commands used, the clean and dirty data, etc.
A final personal comment: I have a feeling that thanks to this work, I managed to understand a little better what Note-craft actually is. It seems I was able to create a conceptual map (as it is called in science) of the territory.
And one more thing: next week (I hope) I will launch a new survey. If you wish to participate, write a comment; I might be able to adapt it for English-speaking respondents.
The result of the work described above is a list of fifteen statements [7] (one for each archetype) that characterize us as Note-crafters. I am very interested to see how the actions described in these statements correlate with how long we have been practicing Note-craft, our professions, our successes, and our failures. That is exactly what the next survey will be about.
Links, Sources, and Secret Stashes. Mostly in Russian
Here is the dirty data with all the responses + every respondent categorized by their corresponding archetype. You can find yourself and check how fairly you were assigned an archetype.
And here is the rest of the data [8]. To save you the hassle of clicking multiple links for individual files, I suggest grabbing them all from my cloud server.
Footnotes
- [1] And a huge thank you to these people. Without their pedantry, what you are reading right now would not exist.
- [3] It could have been fewer. I later re-ran the whole algorithm, not in a Jupyter Notebook but in a Marimo Notebook, and managed to consolidate them into 11 archetypes. However, in Marimo I worked “horizontally” (analyzing one person’s full process end-to-end), whereas in Jupyter I worked across the whole process, grouping by actions in specific contexts (Capture, Process, Study) for all 185 respondents. The latter seemed more appropriate since we are, after all, operationalizing activity.
- [4] If you liked reading this survey and wish to participate in the next, post a comment :)
- [5] Structuralist: Linear thinking. Loves lists, bullets, and clear hierarchy.
- [6] Collector: The Gatherer. Highlights important parts in color and saves quotes.
- [7] I am intentionally not publishing them in this research report so as not to “prime” (bias) those who decide to take the upcoming survey.
- [8] I didn’t install an SSL certificate, so the browser might complain about “security.” Just click “proceed anyway” (or “advanced” → “continue”), and you will land in my Nextcloud.