Posts

Showing posts from November, 2019

Uber's Michelangelo vs. Netflix's Metaflow

  Michelangelo. Pain point: Without Michelangelo, each team at Uber that uses ML (that's all of them - every interaction with the ride or eats app involves ML) would need to build its own data pipelines, feature stores, training clusters, model storage, etc.  Each team would spend copious amounts of time maintaining and improving its systems, and common patterns/best practices would be hard to learn.  In addition, the highest priority use cases (business critical, e.g. rider/driver matching) would have to secure enough compute/storage/engineering resources on their own to keep operating (outages, scale peaks, etc.), which would result in organizational complexity and constant prioritization battles between managers/directors/etc. Solution: Michelangelo provides a single platform that makes the most common and most business critical ML use cases simple and intuitive for builders to use, while still allowing self-serve extensibi...

Redis!...Huh? What ISN'T it good for?

Redis is an in-memory key-value data store that lets you store your actual data structures rather than maintaining a mapping layer between your application and your storage.  It supports the data types you're most likely to need, including lists, sets and hashes/maps. It's in-memory but also has options to push to disk - you can push to disk on every write at a huge performance cost, or at some regular interval.  Writes can be persisted via an append-only log, so each write is just a sequential append, which keeps them lightning fast. Pushing to disk every 1 second has comparable performance to never pushing to disk at all. Redis supports replication in a few different ways.  By default it's asynchronous, but a write can be made to wait for replica acknowledgement for safety.  Combined with append-only logging on every write, this gives you much stronger durability for any successful write, though Redis itself does not promise strict consistency across failovers. Redis Cluster allows automatic sharding and handling of many different failure scenarios, so if a small number of the hosts in your c...

Don't NOT Repeat Yourself!

Sometimes duplicate code is good!! ...what?  What do you mean?  You're 'on to me'? .... Ok ok ok hold on, just hear me out! Picture it: you're writing unit tests.  Headphones on, hoodie blowing in the cold wind from the AC unit in your dark apartment. You've written all your tests, they're passing and you're feeling great.  You refactor.  The tests have a lot of duplicate setup code, so, having the DRY sense of humor you've got, you mop up.  Get it all lookin' fine and tidy.  Common methods for all the setup, some parameters to handle the different configurations of the unit tests. Freshhhhhhhhhhhh :D You push your code to the test env - it breaks - OH SHEEIT.  You missed a few edge cases. No prob, no prob! Just add a few unit tests, slip in an if-else here and there in your production code, and you can get your changes in and make it to the office before the 2pm happy hour! Nice - easy peasy. But WAIT.  The edge cases you missed ...

Dependency Injection Hell-o World!!

When new, wide-eyed engineers first start out writing maintainable, readable, extensible {insert-everything-good}ible software, they quickly stumble upon the concepts of Dependency Injection (DI) and interfaces.  Combined with programming to interfaces, DI has engineers inject a class's dependencies via a constructor or setter method rather than having the class create them itself. For example, you might inject 2 operands '1' and '2' into an instance of the Add class:

    class Add {
        // ...
        Add(IntClass op1, IntClass op2) {
            this.op1 = op1;
            this.op2 = op2;
        }
        // ...
    }

IntClass is an interface, and Add doesn't care what the implementation of it is. It just cares that it exposes the methods that it wants:

    class Add {
        // ...
        int execute() {
            return this.op1.getVal() + this.op2.getVal();
        }
        // ...
    }

Add cares that IntClass exposes getVal(), which should return an int, but doesn't care how it's implemented. Now if you want to write a unit tes...
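
To see the Add example hang together end to end, here is a rough, self-contained sketch (not the post's full code). IntClass, Add, getVal() and execute() come from the excerpt above; the field declarations, the FixedInt stub and the DiSketch entry point are assumptions added here purely so the snippet compiles and runs.

```java
// Hypothetical entry point, added only so the sketch can be run directly.
class DiSketch {
    public static void main(String[] args) {
        // Inject the two operands '1' and '2' through the constructor.
        Add add = new Add(new FixedInt(1), new FixedInt(2));
        System.out.println(add.execute()); // prints 3
    }
}

// The interface Add depends on: anything that can hand back an int.
interface IntClass {
    int getVal();
}

// Hypothetical implementation (also handy as a test stub):
// it always returns the value it was constructed with.
class FixedInt implements IntClass {
    private final int val;

    FixedInt(int val) {
        this.val = val;
    }

    @Override
    public int getVal() {
        return val;
    }
}

// Add only knows about the IntClass interface, never a concrete class.
class Add {
    private final IntClass op1; // field declarations are assumed;
    private final IntClass op2; // the excerpt elides them with "// ..."

    Add(IntClass op1, IntClass op2) {
        this.op1 = op1;
        this.op2 = op2;
    }

    int execute() {
        return this.op1.getVal() + this.op2.getVal();
    }
}
```

Because Add never constructs its own operands, a test can hand it any IntClass it likes, such as FixedInt here, and check execute() without touching whatever the real implementation does behind getVal().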