Packt+ | Advance your knowledge in tech

You're reading from Machine Learning With Go Implement Regression, Classification, Clustering, Time-series Models, Neural Networks, and More using the Go Programming Language

Product type Paperback

Published in Sep 2017

Publisher Packt

ISBN-13 9781785882104

Length 304 pages

Edition 1st Edition

Languages

Concepts

Machine Learning

Author (1):

Joseph Langstaff Whitenack

View More author details

Table of Contents (17) Chapters

Title Page

Credits

About the Author

About the Reviewers

www.PacktPub.com

Customer Feedback

Preface

1. Gathering and Organizing Data

2. Matrices, Probability, and Statistics FREE CHAPTER

3. Evaluation and Validation

4. Regression

5. Classification

6. Clustering

7. Time Series and Anomaly Detection

8. Neural Networks and Deep Learning

9. Deploying and Distributing Analyses and Models

10. Algorithms/Techniques Related to Machine Learning

Caching

Sometimes, our machine learning algorithms will be trained by and/or given input for prediction via data from external sources (for example, APIs), that is, data that isn't local to the application running our modeling or analysis. Further, we might have various sets of data that are being accessed frequently, may be accessed again soon, or may need to be made available while the application is running.

In at least some of these cases, it might make sense to cache data in memory or embed the data locally where the application is running. For example, if you are reaching out to a government API (typically having high latency) for census data frequently, you may consider maintaining a local or in-memory cache of the census data being used so that you can avoid constantly reaching out to the API.

Caching data in memory

To cache a series of values in memory, we will use github.com/patrickmn/go-cache. With this package, we can create an in-memory cache of keys and corresponding values. We can even specify things, such as the time to live, in the cache for specific key-value pairs.

To create a new in-memory cache and set a key-value pair in the cache, we do the following:

// Create a cache with a default expiration time of 5 minutes, and which
// purges expired items every 30 seconds
c := cache.New(5*time.Minute, 30*time.Second)

// Put a key and value into the cache.
c.Set("mykey", "myvalue", cache.DefaultExpiration)

To then retrieve the value for mykey out of the cache, we just need to use the Get method:

v, found := c.Get("mykey")
if found {
    fmt.Printf("key: mykey, value: %s\n", v)
}

Caching data locally on disk

The caching we just saw is in memory. That is, the cached data exists and is accessible while your application is running, but as soon as your application exits, your data disappears. In some cases, you may want your cached data to stick around when your application restarts or exits. You may also want to back up your cache such that you don't have to start applications from scratch without a cache of relevant data.

In these scenarios, you may consider using a local, embedded cache, such as github.com/boltdb/bolt. BoltDB, as it is referred to, is a very popular project for these sorts of applications, and basically consists of a local key-value store. To initialize one of these local key-value stores, do the following:

// Open an embedded.db data file in your current directory.
// It will be created if it doesn't exist.
db, err := bolt.Open("embedded.db", 0600, nil)
if err != nil {
    log.Fatal(err)
}
defer db.Close()

// Create a "bucket" in the boltdb file for our data.
if err := db.Update(func(tx *bolt.Tx) error {
    _, err := tx.CreateBucket([]byte("MyBucket"))
    if err != nil {
        return fmt.Errorf("create bucket: %s", err)
    }
    return nil
}); err != nil {
    log.Fatal(err)
}

You can, of course, have multiple different buckets of data in your BoltDB and use a filename other than embedded.db.

Next, let's say you had a map of string values in memory that you need to cache in BoltDB. To do this, you would range over the keys and values in the map, updating your BoltDB:

// Put the map keys and values into the BoltDB file.
if err := db.Update(func(tx *bolt.Tx) error {
    b := tx.Bucket([]byte("MyBucket"))
    err := b.Put([]byte("mykey"), []byte("myvalue"))
    return err
}); err != nil {
    log.Fatal(err)
}

Then, to get values out of BoltDB, you can view your data:

// Output the keys and values in the embedded
// BoltDB file to standard out.
if err := db.View(func(tx *bolt.Tx) error {
    b := tx.Bucket([]byte("MyBucket"))
    c := b.Cursor()
    for k, v := c.First(); k != nil; k, v = c.Next() {
        fmt.Printf("key: %s, value: %s\n", k, v)
    }
    return nil
}); err != nil {
    log.Fatal(err)
}