
I personally wouldn't do any of these for an average project. They seem like micro-optimizations. But this is missing the largest Go optimizations I can think of, and the ones I would actually do: string vs. []byte, and memory allocation.

A lot of Go programs spend a lot of time converting between []byte and string. But the conversion is actually really slow (in general it allocates and copies the data), and keeping your string data as byte slices is much faster. Since many Go libraries can work with either, the conversion is not necessary.

Also, memory allocations and garbage collection are not free. If you have a hot path that gets hit on every request or in a loop, reusing the same slices can save a lot of time, assuming you can do so without introducing bugs and/or multi-threading issues.
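To make the slice-reuse idea concrete, here is a minimal sketch. The helper name appendRecord and the key=value format are made up for illustration; the point is only that resetting the length with buf[:0] keeps the backing array, so the loop does not allocate as long as the capacity is sufficient.

```go
package main

import "fmt"

// appendRecord is a hypothetical helper: it appends "key=value" to buf
// and returns the extended slice, in the style of the built-in append.
func appendRecord(buf []byte, key, value string) []byte {
	buf = append(buf, key...)
	buf = append(buf, '=')
	buf = append(buf, value...)
	return buf
}

func main() {
	// Allocate once, outside the hot loop.
	buf := make([]byte, 0, 64)
	records := [][2]string{{"a", "1"}, {"b", "2"}}
	for _, kv := range records {
		buf = buf[:0] // reset the length, keep the capacity
		buf = appendRecord(buf, kv[0], kv[1])
		fmt.Printf("%s\n", buf)
	}
}
```

The caveat from the comment applies: the buffer is overwritten on each iteration, so nothing may hold on to a previous result, and it must not be shared between goroutines without synchronization.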



Dumb question, but can you give an example of using []byte over string?

I've got a project where strings aren't exactly crucial, but we're storing a ton of them in memory. We store them as string and, frankly, rarely access most of them - though we do do some compares of strings on the hot path.

Using []byte sounds interesting, but troublesome at the same time. Namely the fact that byte is useless for reading characters from, as you'd eventually have to get runes from it anyway no?

Thoughts?


The key takeaway is that there is a performance cost to converting between []byte and string, and minimizing the number of conversions is reasonably low-hanging fruit when looking for performance optimizations.
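As a rough example of what "using []byte over string" can look like: if the data already arrives as []byte, the standard bytes package mirrors most of the strings package, so it can often be inspected and written out without ever becoming a string. The function isComment below is a hypothetical name, not something from the thread.

```go
package main

import (
	"bytes"
	"os"
)

// isComment is a hypothetical helper. It works on the []byte directly
// instead of calling strings.HasPrefix(string(line), "#").
func isComment(line []byte) bool {
	return bytes.HasPrefix(bytes.TrimSpace(line), []byte("#"))
}

func main() {
	line := []byte("  # a comment\n")
	if isComment(line) {
		// Write takes a []byte, so no conversion is needed here either.
		os.Stdout.Write(line)
	}
}
```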


>Using []byte sounds interesting, but troublesome at the same time. Namely the fact that byte is useless for reading characters from, as you'd eventually have to get runes from it anyway no?

Tons of string manipulation just needs to split, etc on \n, ., ;, - which don't need runes.
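A small sketch of that, assuming UTF-8 input: ASCII delimiters never occur inside a multi-byte UTF-8 sequence, so splitting at the byte level is safe even when the fields contain non-ASCII text. The function names are made up for the example.

```go
package main

import (
	"bytes"
	"fmt"
)

// firstField returns everything before the first ';', or the whole
// line if there is none. The result is a subslice, not a copy.
func firstField(line []byte) []byte {
	if i := bytes.IndexByte(line, ';'); i >= 0 {
		return line[:i]
	}
	return line
}

// countFields counts the ';'-separated fields in line.
func countFields(line []byte) int {
	return bytes.Count(line, []byte(";")) + 1
}

func main() {
	// "h\u00e9llo" contains a two-byte character; no runes are decoded.
	line := []byte("h\u00e9llo;w\u00f6rld;x")
	fmt.Printf("%s %d\n", firstField(line), countFields(line))
}
```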

Besides that, string comparisons don't need runes either (assuming what you're comparing is normalized to the same bytes, which it would be if you apply the same input processing to anything you store and to the strings you query on).
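For instance, a byte-level comparison might look like the sketch below. The normalization here (trim whitespace, lowercase) is only an assumed example of "the same input processing"; real Unicode normalization would need more than this. In practice you would normalize once when storing, not on every compare.

```go
package main

import (
	"bytes"
	"fmt"
)

// normalize is a hypothetical input-processing step applied both to
// stored values and to queries. It returns a new slice.
func normalize(b []byte) []byte {
	return bytes.ToLower(bytes.TrimSpace(b))
}

// sameKey compares two inputs byte for byte after normalization.
func sameKey(a, b []byte) bool {
	return bytes.Equal(normalize(a), normalize(b))
}

func main() {
	fmt.Println(sameKey([]byte("  Foo "), []byte("foo")))
	fmt.Println(sameKey([]byte("foo"), []byte("bar")))
}
```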


[]byte is crucial if you need to frequently modify the contents. The classic example is building a string with a loop that appends a new character on each iteration. Using += to concatenate strings inside such a loop will be slow; the Right Thing To Do is to preallocate a []byte with sufficient capacity, append to it in a loop, and then (if necessary) convert to a string only at the end.
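A minimal sketch of that pattern, with a made-up function name. The capacity is known up front here, so append never has to grow the slice, and the only conversion happens once at the end.

```go
package main

import "fmt"

// repeatChar is a hypothetical example: it builds a string of n copies
// of c by appending to a preallocated []byte.
func repeatChar(c byte, n int) string {
	buf := make([]byte, 0, n) // length 0, capacity n
	for i := 0; i < n; i++ {
		buf = append(buf, c)
	}
	return string(buf) // single conversion at the end
}

func main() {
	fmt.Println(repeatChar('x', 5))
}
```

The standard library's strings.Builder wraps much the same idea if you would rather not manage the slice yourself.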


Isn't there a premade library that will abstract these optimizations away? e.g. a rope library. Looks like there's a few libraries out there that implement this data structure. (https://en.wikipedia.org/wiki/Rope_(data_structure))


Doesn't Go have a mutable char/rune array?


Yes, it's just []rune, which has just a bit of special sauce in the runtime that allows you to convert it directly to a string [1], and some functions that convert UTF-8 to and from it. However, rune is a 32-bit number sufficient for storing a Unicode code point. Generally you're dealing in UTF-8 and prefer to just pass that through code. I think I've only ever used a []rune once, when I was doing some very heavy-duty Unicode-aware code, and I still ended up refactoring it into using UTF-8 directly in the end.

[1]: https://play.golang.org/p/5DEzw85J5Ob Note this is a conversion; the string is UTF-8 and the runes are 32-bit ints, so this creates a new string.
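For readers who don't follow the link, here is a separate small sketch of the round trip (not a copy of the playground snippet; replaceRune is a made-up name). Both conversions copy: string to []rune decodes the UTF-8, and []rune to string re-encodes it.

```go
package main

import "fmt"

// replaceRune is a hypothetical helper: it replaces the i-th code
// point of s with c. It panics if i is out of range.
func replaceRune(s string, i int, c rune) string {
	r := []rune(s) // decodes UTF-8 into 32-bit code points (copy)
	r[i] = c
	return string(r) // encodes back to UTF-8 (another copy)
}

func main() {
	s := "h\u00e9llo" // five code points, six bytes in UTF-8
	fmt.Println(len(s), len([]rune(s)))
	fmt.Println(replaceRune(s, 1, 'e'))
}
```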


Yep, GC allocation is not the only path to use memory in Go.



