roaring

This is a Go version of the Roaring bitmap data structure.

Roaring bitmaps are used by several major systems such as Apache Lucene and derivative systems such as Solr and Elasticsearch, Apache Druid (Incubating), LinkedIn Pinot, Netflix Atlas, Apache Spark, OpenSearchServer, Cloud Torrent, Whoosh, Pilosa, Microsoft Visual Studio Team Services (VSTS), and eBay’s Apache Kylin.

Roaring bitmaps are found to work well in many important applications:

Use Roaring for bitmap compression whenever possible. Do not use other bitmap compression methods (Wang et al., SIGMOD 2017)

The roaring Go library is used by

This library is used in production in several systems. It is part of the Awesome Go collection.

There are also Java and C/C++ versions. The Java, C, C++ and Go versions are binary compatible: e.g., you can save bitmaps from a Java program and load them back in Go, and vice versa. We have a format specification.

This code is licensed under Apache License, Version 2.0 (ASL2.0).

Copyright 2016-… by the authors.

References

  • Daniel Lemire, Owen Kaser, Nathan Kurz, Luca Deri, Chris O’Hara, François Saint-Jacques, Gregory Ssi-Yan-Kai, Roaring Bitmaps: Implementation of an Optimized Software Library, Software: Practice and Experience 48 (4), 2018. arXiv:1709.07821
  • Samy Chambi, Daniel Lemire, Owen Kaser, Robert Godin, Better bitmap performance with Roaring bitmaps, Software: Practice and Experience 46 (5), 2016. http://arxiv.org/abs/1402.6407. This paper used data from http://lemire.me/data/realroaring2014.html
  • Daniel Lemire, Gregory Ssi-Yan-Kai, Owen Kaser, Consistently faster and smaller compressed bitmaps with Roaring, Software: Practice and Experience 46 (11), 2016. http://arxiv.org/abs/1603.06549

Dependencies

Dependencies are fetched automatically when you pass the -t flag to go get.

They include:

  • github.com/willf/bitset
  • github.com/mschoch/smat
  • github.com/glycerine/go-unsnap-stream
  • github.com/philhofer/fwd
  • github.com/jtolds/gls

Note that the smat library requires Go 1.6 or later.

Installation

  • go get -t github.com/RoaringBitmap/roaring

Example

Here is a simplified but complete example:

package main

import (
    "bytes"
    "fmt"

    "github.com/RoaringBitmap/roaring"
)


func main() {
    // example inspired by https://github.com/fzandona/goroar
    fmt.Println("==roaring==")
    rb1 := roaring.BitmapOf(1, 2, 3, 4, 5, 100, 1000)
    fmt.Println(rb1.String())

    rb2 := roaring.BitmapOf(3, 4, 1000)
    fmt.Println(rb2.String())

    rb3 := roaring.New()
    fmt.Println(rb3.String())

    fmt.Println("Cardinality: ", rb1.GetCardinality())

    fmt.Println("Contains 3? ", rb1.Contains(3))

    rb1.And(rb2)

    rb3.Add(1)
    rb3.Add(5)

    rb3.Or(rb1)

    // computes union of the three bitmaps in parallel using 4 workers  
    roaring.ParOr(4, rb1, rb2, rb3)
    // computes intersection of the three bitmaps in parallel using 4 workers  
    roaring.ParAnd(4, rb1, rb2, rb3)


    // prints 1, 3, 4, 5, 1000
    i := rb3.Iterator()
    for i.HasNext() {
        fmt.Println(i.Next())
    }
    fmt.Println()

    // next we include an example of serialization
    buf := new(bytes.Buffer)
    rb1.WriteTo(buf) // we omit error handling
    newrb := roaring.New()
    newrb.ReadFrom(buf)
    if rb1.Equals(newrb) {
        fmt.Println("I wrote the content to a byte stream and read it back.")
    }
    // you can iterate over bitmaps using ReverseIterator(), Iterator(), ManyIterator()
}

If you wish to use serialization and handle errors, you might want to consider the following sample of code:

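	// This fragment assumes it runs in a test inside package roaring,
	// where t is a *testing.T.
	rb := BitmapOf(1, 2, 3, 4, 5, 100, 1000)
	buf := new(bytes.Buffer)
	_, err := rb.WriteTo(buf) // the first return value is the number of bytes written
	if err != nil {
		t.Errorf("Failed writing")
	}
	newrb := New()
	_, err = newrb.ReadFrom(buf) // the first return value is the number of bytes read
	if err != nil {
		t.Errorf("Failed reading")
	}
	if !rb.Equals(newrb) {
		t.Errorf("Cannot retrieve serialized version")
	}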

Given N integers in [0,x), the serialized size in bytes of a Roaring bitmap should never exceed this bound:

8 + 9 * ((long)x+65535)/65536 + 2 * N

That is, given a fixed overhead for the universe size (x), Roaring bitmaps never use more than 2 bytes per integer. You can call BoundSerializedSizeInBytes for a more precise estimate.

Documentation

Current documentation is available at http://godoc.org/github.com/RoaringBitmap/roaring

Goroutine safety

It should generally not be considered safe to access the same bitmap from different goroutines: bitmaps are left unsynchronized for performance. Should you want to access a Bitmap from more than one goroutine, you should provide synchronization. Typically this is done by using channels to pass the *Bitmap around (in Go style, so there is only ever one owner), or by using sync.Mutex to serialize operations on Bitmaps.

Coverage

We test our software. For a report on our test coverage, see

https://coveralls.io/github/RoaringBitmap/roaring?branch=master

Benchmark

Type

     go test -bench Benchmark -run -

To run benchmarks on Real Roaring Datasets run the following:

go get github.com/RoaringBitmap/real-roaring-datasets
BENCH_REAL_DATA=1 go test -bench BenchmarkRealData -run -

Interactive use

You can use roaring with gore:

  • go get -u github.com/motemen/gore
  • Make sure that $GOPATH/bin is in your $PATH.
  • go get github.com/RoaringBitmap/roaring
$ gore
gore version 0.2.6  :help for help
gore> :import github.com/RoaringBitmap/roaring
gore> x:=roaring.New()
gore> x.Add(1)
gore> x.String()
"{1}"

Fuzz testing

You can help us test the library further with fuzz testing:

     go get github.com/dvyukov/go-fuzz/go-fuzz
     go get github.com/dvyukov/go-fuzz/go-fuzz-build
     go test -tags=gofuzz -run=TestGenerateSmatCorpus
     go-fuzz-build github.com/RoaringBitmap/roaring
     go-fuzz -bin=./roaring-fuzz.zip -workdir=workdir/ -timeout=200

Let it run. If the number of crashers is greater than 0, check the reports in the workdir, where you should be able to find the goroutine stack traces of the panics.

Alternative in Go

There is a Go version wrapping the C/C++ implementation: https://github.com/RoaringBitmap/gocroaring

For an alternative implementation in Go, see https://github.com/fzandona/goroar. The two versions were written independently.

Mailing list/discussion group

https://groups.google.com/forum/#!forum/roaring-bitmaps