Better and cheaper speech transcription with Hreimur

It has only been a short while since we gave Málstaður a complete makeover, but the upgrades keep coming! This time we have focused on Hreimur, and we have also designed new icons for all of the tools available on Málstaður. On top of that, we have significantly lowered the price of speech transcription.

Transcription history

Users can now access older speech transcription results and recordings in Hreimur, as well as run multiple transcription tasks at a time. This means users no longer have to worry about their results getting lost, and they can maintain up to 10 tasks simultaneously.

In the sidebar on the right, you can also get a summary and meeting minutes based on the speech recognition results. These can then be copied, moved to the Málfríður editor, or downloaded.

For those who do not wish to save any information regarding previous speech transcription tasks, this feature can be disabled in the user settings, which can be accessed by clicking on the username at the bottom of the left-hand sidebar.

Recording inside editor

We are committed to the accessibility of all our products, and Hreimur's speech transcription can now be used directly inside the Málfríður and Erlendur editors. Instead of uploading a video or audio recording to the Hreimur interface, users can speak directly into the editor using the dedicated button and get the output in the same place. This is especially useful for those who primarily use Hreimur to dictate emails or other messages, or to draft essays.

Recording starts once you click the microphone icon in the menu below the editor and grant access to your microphone.

When the recording and speech recognition have finished, the result is displayed and can then be inserted into the editor.

Speaker identification

Since the last major update, Hreimur has offered optional speaker diarization, i.e., a feature that divides up the speech transcription output based on who is speaking at any given time. Different speakers have been labeled as “Speaker 1,” “Speaker 2,” etc., but these names can be changed in the interface; “Speaker 1” can become “Jón Jónsson,” and so on.

We are now introducing a new and improved feature, speaker identification, where the model itself determines the identity of the speakers based on information found in the recorded speech, such as when one speaker addresses or refers to another by name, or when speakers introduce themselves by name. Below is an example of speaker identification from a recording of a parliamentary session of the Althing.

Better speech recognition model

We are constantly iterating on and improving the model that powers Hreimur's speech recognition and we evaluate our model on a variety of benchmarks. These benchmarks are challenging and include difficult audio conditions, a diverse group of speakers, and varying recording quality. We monitor three key metrics:

Word Error Rate (WER): The percentage of words the model transcribes incorrectly (lower is better).
Character Error Rate (CER): The percentage of characters the model transcribes incorrectly, which captures more subtle errors than WER does (lower is better).
chrF++: A score from 0 to 100 that measures how closely the output matches a human reference (higher is better).
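
To make the first two metrics concrete, here is a minimal sketch of how WER and CER are commonly computed: the edit distance between the reference and the model output, divided by the length of the reference. This is an illustration only, not Miðeind's evaluation code; the function names and the example sentences are assumptions made for this sketch, and real evaluations usually normalize casing and punctuation first.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one wrong word out of six, one wrong character out of 22.
reference = "the cat sat on the mat"
hypothesis = "the cat sat on the hat"
print(f"WER: {wer(reference, hypothesis):.1%}")  # WER: 16.7%
print(f"CER: {cer(reference, hypothesis):.1%}")  # CER: 4.5%
```

The example shows why CER captures subtler errors: a single wrong letter counts as a whole wrong word in WER but as only one wrong character in CER. Because inserted words also count as errors, WER computed this way can exceed 100% on very poor output.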

Since our baseline in August 2025, the model's raw output (i.e., without any automatic post-processing) has shown a 26% relative reduction in word errors and a 30% relative reduction in character errors, along with an increase in chrF++ from 88.9 to 90.9. This progress is also maintained after the post-processing that is performed on Hreimur's output before it is displayed to users: word errors have decreased by 22% and character errors by 29% compared to the same baseline.
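
The percentages above are relative reductions, not percentage-point drops. A small sketch of the arithmetic follows; the absolute error rates in it are hypothetical, since only the relative figures and the chrF++ scores are reported here.

```python
def relative_reduction(baseline, current):
    """Fraction by which `current` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - current) / baseline


# Hypothetical absolute values, chosen only to illustrate a 26% relative reduction:
# a WER that falls from 10.0% to 7.4% drops 2.6 percentage points, i.e. 26% relative.
print(f"{relative_reduction(10.0, 7.4):.0%}")  # 26%

# chrF++ is reported as absolute scores: 88.9 -> 90.9 is a gain of 2.0 points.
print(round(90.9 - 88.9, 1))  # 2.0
```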

Notably, a large part of the accuracy gain comes from the model itself rather than from post-processing. The gap between the raw and post-processed output has narrowed with each training iteration. This indicates that the model is learning to produce cleaner transcriptions straight out of the box, which reduces our need for corrections in later steps.

New icon

Perhaps the most obvious change to Hreimur is the new icon that identifies it in the Málstaður interface. All the tools on Málstaður have now received new icons that better distinguish them from one another and are more descriptive of the functionality each tool offers. The new Hreimur icon is inspired by sound waves as they often appear in recording and audio processing tools.

Better rates

The icing on the cake is that we have now updated our pricing and subscription benefits for speech transcription to make it more accessible and cheaper starting in April:

Fixed Subscription includes twice as much speech transcription per month, or 10 hours instead of 5 hours.
Flexible Subscription: the 1,000 ISK credit that comes with each user now covers one free hour of speech transcription, and usage beyond the credit costs half as much, or 1,000 ISK/hour instead of 2,000 ISK/hour.
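
As a rough illustration of what the Flexible Subscription change means in practice, the sketch below compares the old and new hourly rates. It assumes the 1,000 ISK credit is simply deducted from that user's transcription usage and that usage is billed proportionally to duration; the function name and these billing details are assumptions, not a description of how Málstaður actually invoices.

```python
CREDIT_ISK = 1000  # credit that comes with each user


def flexible_cost_isk(hours, rate_per_hour, credit=CREDIT_ISK):
    """Cost of `hours` of transcription after the credit is deducted (assumed model)."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return max(0, hours * rate_per_hour - credit)


# Hours covered by the credit at each rate.
print(CREDIT_ISK / 2000)  # 0.5 (old rate)
print(CREDIT_ISK / 1000)  # 1.0 (new rate)

# Three hours of transcription for a single user under this assumed model.
print(flexible_cost_isk(3, 2000))  # 5000 (old rate)
print(flexible_cost_isk(3, 1000))  # 2000 (new rate)
```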

Coming soon

We are nowhere near done developing and improving Hreimur. We are currently working on a solution that offers real-time speech recognition and plan to offer it on Málstaður later this year. In addition, we will continue to update and improve the Hreimur interface to make it even easier for users to work with the output efficiently.

