Ten years of Miðeind: Bridging the gap between you and technology

Miðeind is currently celebrating its tenth anniversary. On this occasion, we are publishing a series of articles that look back on the journey and enumerate the progress that has been made in language technology for Icelandic over the past decade.


Much has changed in the nearly nine years that I have worked at Miðeind, but I have had the good fortune to be involved, in one way or another, with almost all of the company's projects.

Looking back, certain threads can be identified that have woven through the work – user value, accessibility, and focus on products. These threads matter, not just for us but for those who use our products.

Size – Great oaks from little acorns grow

The first thing to mention is that during this decade, Miðeind has grown from a single employee into the robust startup it is today, with 16 employees, and has tremendously increased its capacity to take on projects as a result. Miðeind has never been a conventional startup, though, as the company is purpose-driven rather than profit-driven.

Miðeind is a tight-knit and powerful group of people with diverse backgrounds and a burning passion for language technology and artificial intelligence. The company brings together one of the largest groups of artificial intelligence experts in the country, and it's safe to say that we closely follow all research and technological advances in the field.

Technology – Roll with the punches

Technology is developing at an incredibly rapid pace, and we have had to evolve with it. There have been instances where almost all the premises of a project had changed between the time we applied for it and the time funding was secured. This requires us to be flexible and quick to adapt, which has proven easier with a strong team and a good overview of the field.

When we started, rule-based systems were at the forefront. They require a lot of manual work and time, as rules need to be written for all system behavior. For example, Greynir, Miðeind's first solution, contains nearly 8,000 lines of hand-written grammatical rules for parsing text. Another example is Yfirlestur, Miðeind's proofreading tool based on Greynir.

Neural network models then began to gain ground, making it possible to train a model specifically to solve a particular task. This requires a large amount of data labeled for that task, and is therefore not within everyone's reach. An example of such a model is Málfríður, the successor to Yfirlestur, which is trained on texts with tagged errors, including data from the Icelandic Language Technology Programme. Another example is Miðeind's Vélþýðing, which was specially trained on a vast amount of parallel data for Icelandic and English (and later more languages).

When the large language model GPT appeared on the scene, everything changed. Here was a model that could solve tasks without having been specifically trained for them, because it had been trained on an enormous amount of data. GPT has continued to evolve, along with other models that have sprung up. For example, Erlendur, the successor to Vélþýðing, is based on a large language model at its core, but various tricks are employed on top of it to obtain more accurate and reliable translations.

All of this has happened in just a decade, and especially rapidly in the last five years.

At Miðeind, we don't claim to foresee where technology is heading, but two examples indicate how well we have managed to keep up with it. As early as 2016, we applied for a grant to create a large tagged and parsed corpus (GreynirCorpus). The following year, we applied for a grant to create a neural network model that would be trained on that data to learn tagging and parsing. That technology was brand new at the time.

In the same vein, in spring 2022, Miðeind facilitated a meeting between the President of Iceland and a small tech company in Silicon Valley that we thought was doing impressive and progressive things. This resulted in a collaboration between OpenAI and Miðeind, which has, among other things, led to Icelandic being the only language besides English that was specifically considered in the training of GPT-4.

In this way, we have consistently kept ourselves at the forefront when it comes to utilizing the latest technology.

Projects – You live and learn

The projects we have applied for and undertaken in this decade are naturally colored by the technology and data available at any given time. We have always placed a major emphasis on practical application and ensuring that Icelandic is used in the tech world – how can we put what we offer into use?

We participated in the Language Technology (LT) Programme for Icelandic I, where the focus was on building data sets and developing fundamental LT solutions. At that time, there were few tools for the Icelandic language, and initially, the solutions were rule-based. The LT programme produced many good things: important data that is useful for training and evaluating models, and fundamental tools like the Tokenizer, which is still widely used. We also gained valuable experience from the programme and got in touch with various stakeholder groups, gaining insight into their needs.

In line with our emphasis on practical application, it's worth mentioning that we have always made it a priority to keep our software packages accessible. This includes writing precise and detailed documentation and publishing the packages on the open package repository PyPI, which makes it particularly easy for users to download and use them.

In the latter part of the LT programme, when the technology made it possible, we participated in various collaborative projects outside of the programme. These projects involved specialized training of models for specific purposes and even for specific systems. Various exciting ideas emerged that we pursued, but we quickly found that it was difficult to stretch in so many different directions. We had usually stuck to the core technology and left implementation to others, but we soon saw that this distinction didn't hold; there is always some implementation work involved in such projects.

We also realized that although the technology was available, there was little awareness of it and the possibilities it brought. Additionally, it was unreasonable to expect every single company or municipality to implement it on their own, let alone the general public. The business community therefore didn't seize the opportunity as we had hoped.

We learned from these projects that to fulfill our goals, it was best to look at Icelandic society as a whole and try to address the needs of as many as possible, instead of developing specialized tools. This way, we stay as close to the basics as possible while putting the benefits of the technology into the hands of as many people as possible.

As a result, Miðeind moved in a much more product-oriented direction. The main focus shifted to developing products that meet the needs of the general public and bridge the gap between them and the technology. A research component is always maintained, however, as all our products are built on it.

Málstaður – All roads lead to Rome

At this point, we had a good range of solutions for Icelandic, which we wanted to put into wider use. From this, the idea of Málstaður, an integrated platform for all of Miðeind's main language technology solutions, was born. Málstaður is our way of putting the technology and the benefits that can be achieved with it into the hands of the public, institutions, and companies of the country. Our focus on practical use from the beginning has helped shape Málstaður and the ideas behind it.

Offering all these solutions in one place, where they can also communicate with each other, opens up diverse possibilities. For example, it's simple to record a meeting, transcribe it to text with Hreimur, send the transcript to Málfríður for summarization to get meeting minutes, and finally send the minutes to Erlendur for translation into multiple languages.

Future vision – Strike while the iron is hot

Málstaður has received an incredibly positive reception, and it's wonderful to hear how it has changed work processes and even quality of life – that's exactly why we're doing this.

We are constantly adding functionality to Málstaður and making it more user-friendly. This includes, among other things, the translations in Erlendur, summaries in Málfríður, a help center, and videos we have created to demonstrate specific use cases and thus get closer to the user's needs. We want to help you see how Málstaður can benefit you.

In conclusion

It has been educational to develop ideas, apply for project grants, and watch technology evolve. Some projects have become products used by thousands today. Others didn't take off, but the experience from them shaped our next steps. We learned to maintain focus: not just to chase exciting things, but to keep the user and utility at the forefront. This focus has resulted in Málstaður, which is a direct reflection of our values.

We have learned to move quickly. To keep our finger on the pulse. We have also learned that it's not enough to develop a solution or to be at the forefront of technology – utilization and usefulness are everything. Technology that no one uses changes nothing.

There are many things we could have done differently, but there are also many things we did right. And there is even more that we intend to do.


Want to know more? Get the latest news on our website or follow us on social media.
