The Story

Miðeind is a leading software company in the fields of Language Technology and Artificial Intelligence

Vilhjálmur Þorsteinsson

Vilhjálmur Þorsteinsson, founder and owner of Miðeind, has been involved in Icelandic information technology for well over three decades. He founded his first startup company in 1983, at the age of 17.

How did it come about that you started programming and founded your own company at such a young age?

"I had a tremendous interest in the brand-new phenomenon that personal computers were at the time. Around 1980, when the first personal computers arrived in the country, I managed to secure a summer job programming them, which was then extended.

In 1983, my friend Örn Karlsson and I decided to create new Icelandic business software from scratch and subsequently founded the company Íslensk forritaþróun (Icelandic Software Development). The software we developed was eventually used in about 1,300 companies in Iceland for accounting and operations."

As can be seen, Vilhjálmur was drawn into software development early on and worked in it for many years. The company Íslensk forritaþróun was then sold to a British company around 1998. Vilhjálmur continued to work for the British company for about two years, but then he started itching to do something new. Around the turn of the millennium, he founded the company Homeportal along with several others. The idea was to develop services for networked devices in homes. At that time, neither the hardware nor the internet was powerful enough for that idea to take off. It's perhaps only now, almost 20 years later, that similar services are starting to see the light of day.

What came after Homeportal?

"I kind of dropped out of the IT industry and started thinking about all sorts of other things. Among other things, I attended the preliminary department of the Reykjavik School of Visual Arts during the winter of 2006-2007 and enrolled in a portrait course in the United States after that. I found it most enjoyable of all to paint portraits of people, and I have several of those in my possession. I was also the chairman of CCP for some time and sat on the board of CCP for seven years. I participated in getting the Verne data center in Reykjanesbær up and running. Alongside these projects, I was also investing in some startup companies."

Do you see significant contrasts between visual art and software development, or can programming be seen as a form of artistic expression?

"Yes, I discovered through all of this that there are more similarities between visual art and programming than many people think. Programming is in large part about patterns, creating patterns and recognizing patterns. Reusing patterns, creating flexible patterns, breaking them up, repeating them, and changing them.

Programs can be ugly or beautiful. There's a lot of aesthetics in programming, and when you read or write code, it's very important in my opinion to have a certain sense of the beauty or the aesthetics of the code. Beautiful code is simply better than ugly code: it lasts longer and has fewer errors in it."

How did Netskrafl come about?

"I suppose I can thank my daughter for that. She graduated with a degree in computer science from the University of Iceland in 2014, and I felt somewhat uncomfortable not being able to converse with her about the latest technology. I decided I needed to brush up on my programming skills to keep up with her. Subsequently, I familiarized myself with the Python programming language and web services, and decided to create an Icelandic Scrabble game for the internet to practice. It ended up becoming Netskrafl, which now has about 25,000 registered users."

While working on Netskrafl, Vilhjálmur became familiar with the Database of Modern Icelandic Inflection (DMII), a database managed by the Árni Magnússon Institute for Icelandic Studies. The database contains information about the inflectional forms of the vast majority of common words in the modern Icelandic language and is open to everyone on the internet free of charge.

Where did the idea for Greynir come from?

"When I discovered that this database of Icelandic existed, it reminded me of an old hobby. Between 1985 and '87, I was involved in a project called Artek that created a compiler for the Ada programming language. The compiler takes in text in the respective programming language and translates it into machine code that the computer understands. I started wondering if it would be possible to make a computer 'understand' Icelandic text and analyze it using similar techniques.

Language consists of certain tree structures that underlie it, and there are specific rules that we follow, mostly unconsciously, when we speak and write. It's possible to figure out what these rules are and describe them to a computer, which then tries to find this underlying structure - this is what we call parsing text.

Greynir was actually just an experimental project. I started this without knowing whether it was possible or not; I equally expected that I would encounter a barrier somewhere and find out that it wasn't possible. But this barrier simply never appeared. It was always possible to continue and gradually do better and better. Today, Greynir manages to parse over 90% of sentences in a typical news article."

In short, how does Greynir work?

"Greynir works by reading Icelandic text, dividing it into sentences, examining each word individually, looking it up in BÍN (the Database of Modern Icelandic Inflection), and determining what it could be, for example, which word classes it might belong to. For instance, the word 'á' could be a noun (ær or á), a verb (eiga), a preposition (á borðinu), or an adverb (æfingin reynir á). Then Greynir tries to fit the words in each sentence to grammatical rules, and if successful, it can create a sentence tree. Once we have a sentence tree, we roughly know how the sentence is structured, and then we can infer who is doing what and how. We then know what the subject of the sentence is, what verbs are present, and what the objects of these verbs are. Following that, we can extract facts, statements, and information from the tree, such as people's titles, definitions of proper nouns, and various other things. This way, we can use a computer to derive meaning from a sentence and read information from text. We can also look at deviations, that is, we can see if there's something wrong with the sentence. Greynir can catch both grammatical and spelling errors, even when the phrasing is complex."
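The steps Vilhjálmur describes (look each word up, collect its possible word classes, fit the words to grammatical rules, then read roles off the tree) can be illustrated with a toy sketch. This is not Greynir's code or its API: the three-word lexicon, the three grammar rules, and the names `LEXICON`, `GRAMMAR`, `parse_sentence`, and `roles` are all assumptions made up for this illustration, and a real parser for Icelandic also has to handle case, gender, number, and far larger rule sets.

```python
from typing import Iterator, List, Tuple

# Hypothetical mini-lexicon standing in for a lookup in BÍN:
# each word form maps to every word class it could belong to.
LEXICON = {
    "ég": {"pronoun"},
    "á": {"noun", "verb", "preposition", "adverb"},
    "hest": {"noun"},
}

# Hypothetical toy grammar: nonterminal -> list of alternative productions.
# Symbols that are not keys here are treated as word classes.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["pronoun"], ["noun"]],
    "VP": [["verb", "NP"], ["verb"]],
}


def tokenize(text: str) -> List[str]:
    """Split on whitespace, strip simple punctuation, lowercase."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return [w for w in words if w]


def parse(symbol: str, tokens: List[str], pos: int) -> Iterator[Tuple[tuple, int]]:
    """Yield (tree, next_position) for every way `symbol` matches at `pos`."""
    if symbol in GRAMMAR:
        for production in GRAMMAR[symbol]:
            for children, end in expand(production, tokens, pos):
                yield (symbol, children), end
    elif pos < len(tokens) and symbol in LEXICON.get(tokens[pos], set()):
        # A word class matches if the lexicon allows it for this word.
        yield (symbol, tokens[pos]), pos + 1


def expand(production: List[str], tokens: List[str], pos: int):
    """Match the symbols of one production left to right."""
    if not production:
        yield [], pos
        return
    for tree, mid in parse(production[0], tokens, pos):
        for rest, end in expand(production[1:], tokens, mid):
            yield [tree] + rest, end


def parse_sentence(text: str) -> List[tuple]:
    """Return every tree that covers the whole sentence."""
    tokens = tokenize(text)
    return [tree for tree, end in parse("S", tokens, 0) if end == len(tokens)]


def roles(tree: tuple) -> dict:
    """Read subject, verb, and object off an S -> NP VP tree."""
    _, (np, vp) = tree
    vp_children = vp[1]
    obj = vp_children[1][1][0][1] if len(vp_children) > 1 else None
    return {"subject": np[1][0][1], "verb": vp_children[0][1], "object": obj}
```

Calling `parse_sentence("Ég á hest.")` ("I own a horse") tries every word class the lexicon allows for 'á', but only the verb reading fits the rules, so a single tree comes back, and `roles` then reports who is doing what to what. The exhaustive backtracking used here is chosen for brevity; it is not a claim about the parsing algorithm Greynir actually uses.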

Why is Greynir spelled with 'y'?

"Originally, Greynir was supposed to be named Reynir, but that name was reconsidered when we discovered that the domain reynir.is was owned by the sports club Reynir in Sandgerði. The 'G' was then added to the front, as we think Greynir can be quite a poor thing ('grey') when tasked with analyzing complex sentences."

Where did the idea for the speech assistant app Embla come from?

"We had a simple query system on Greynir's website where it was possible to enter questions and get answers from Greynir's data, for example about people's titles and definitions of proper nouns. Sveinbjörn Þórðarson, our employee and app specialist among other things, had the brilliant idea to try to create a voice app that connected to this query system. I think he just threw together the first prototype at home, and when he showed it to the rest of us, it was immediately obvious that this was incredibly clever. Connecting voice with the language analysis system we have offers countless possibilities. Among other things it provides accessible services that benefit the general public, but also people who have difficulty using traditional screens and keyboards."

What other possibilities do you see for Greynir?

"The opportunities surrounding Greynir are numerous. Recently, we have been working on language review, i.e. a proofreading tool for text. Greynir's uniqueness in this context is that it can find and provide guidance to the user regarding grammatical errors, not just spelling mistakes. As a result, it can provide much more detailed suggestions than other language review tools for Icelandic to date. We are also working on machine translation, that is, automatically translating text between Icelandic and other languages, primarily English to start with. We also envision various possibilities related to artificial intelligence, for example, being able to answer questions from text and summarize text."

Greynir is not only unique in Iceland but also has a certain uniqueness worldwide, at least in terms of the underlying technology. There are not many tools in the world that fully parse text into trees using methods comparable to Greynir. Indeed, the parsing technology in question is relatively new, having first appeared in scientific journals about ten years ago.

Is this something that you envision could be utilized for other languages?

"Yes, the basic technology could be used for other languages, and you could almost say that since the technology works for Icelandic, it should work for most other languages. It could be exciting in the future to explore whether the same technology could be suitable for, for example, smaller languages in neighboring countries."

Have you always had an interest in and passion for the Icelandic language?

"Yes, I think it's safe to say that. I read a lot as a child and always enjoyed Icelandic as a subject in school. I had very good teachers in elementary school, which has been very beneficial for me. For example, Einar Magnússon taught me Icelandic at Hagaskóli; from him, I learned the fundamentals of Icelandic grammar, and I have benefited from that knowledge ever since. I find Icelandic to be an interesting language; there's so much to it, and a rich history is embedded in the language. When you try to capture Icelandic in rules, you discover that there are all kinds of exceptions to the rules and fixed phrases that are old and can be traced back to Njál's Saga, the Prose Edda, and other ancient texts."

What motivates this extensive work on Greynir?

"I felt a sense of duty when I realized that Icelandic could be left behind in this new digital world. It matters that Icelandic remains competitive in digital form. I followed the development of language technology in other languages and saw how neural networks and artificial intelligence were being used to create all kinds of impressive services based on large amounts of text and speech. However, comparable services were nowhere on the horizon for Icelandic. And then the question was, what consequences would it have if people couldn't use Icelandic to talk to their devices? I realized that those who are familiar with both the technology and the Icelandic language shouldn't step aside, but rather needed to take the lead and tackle this project. Fortunately, I wasn't alone in these concerns, and a lot of good preparatory and analytical work had already been done, for example at the Árni Magnússon Institute and Almannarómur, the Centre for Icelandic Language Technology, as well as by our tireless pioneers in this field, including Eiríkur Rögnvaldsson, Sigrún Helgadóttir, and Kristín Bjarnadóttir, to name just a few of many. So it was both easy and enjoyable to join the group."

Many people are concerned about the handling of personal information. What is Miðeind's policy on these matters?

"We share these concerns and fully understand them. A lot of Miðeind's software is open-source, which means that anyone can go to a specific place on the web (https://github.com/mideind) and simply examine our programs. This way, anyone can see what we're doing and verify what we say. It also means that programmers and other interested people who want to suggest additions or changes to Miðeind's software can send us their proposals. We gratefully accept such assistance.

Regarding Embla, we have a clear and easily understandable privacy policy. We do not collect personal information in any direct way, as there's no need to register anywhere to use Embla. However, it is possible to link the queries we receive to the device that sends them, and then it might be possible, through indirect means, to connect them to the individuals behind the devices. Embla's default setting is to send location with a query because some of the answers Embla gives are based on it. It's easy to turn this option off so that Embla doesn't send location information, but doing so also reduces the number of questions she can answer. More importantly, the user can, at any time, delete all data that has come from their smart device using a dedicated button in the Embla app. This way, the user always remains in control of the data."

Vilhjálmur says that the team is small but powerful beyond its size. The company will continue to work for the benefit of the Icelandic language in an ever-changing technological world where rapid development is ahead, including in the field of artificial intelligence. "We may never become Google, Amazon, or Microsoft, but what we do will hopefully help keep these big companies on their toes, so that they take Icelandic seriously and include it in their products and services. If they don't do it, then we will."