A Herd Of Rabbits Part 2: RabbitMQ Data Pipelines

2020 Mar31
RabbitMQ is a powerful message broker allowing engineers to implement complex messaging topologies with relative ease. At the day job we used RabbitMQ as the backbone of our real-time data infrastructure. In the previous post we set up a simple PostgreSQL trigger to send change capture messages to a RabbitMQ exchange. Conceptually, this is where we left off:

In this early stage, we basically have a fire-hose that we can selectively tap into. But we have no way to control the flow of data.

To recap a bit before we get too deep, we had a simple and manual way of handling real-time operations. Effectively, we just baked all of the logic into the specific application code path.

Read More

A Herd Of Rabbits Part 1: Postgres Change Capture

2020 Mar27

Postgres is no longer "just a database." It has become a data integration and distribution platform. It has hooks for integrating custom data types, data formats, remote data store integration, remote index support, a rich extension ecosystem, and cascading logical replication facilities. It is practically an application server. A proverbial Swiss Army knife, to say the least.

At the day job, we use postgres as the primary database. As a communication platform (chat), we do a good deal of real-time whiz-bangery. Being that we are an early stage start-up, we try to follow the keep it simple, stupid (k.i.s.s.) approach. Do the simplest thing possible until it isn't simple anymore. The simple approach to real time was a

Read More

Fun With Postgres: Custom Constraints

2019 Sep29

Postgres comes with a rich set of tools to help developers maintain data integrity. Many of them come in the form of Constraints. Foreign Key, Not Null, Exclusion, and Check constraints allow developers to offload what would usually be a lot of application code to the database server with very little effort.

Between the Foreign Key, which verifies values in another table, and the Check constraint, which verifies values in a specific column, you can typically accomplish just about everything you need to do rather easily. The main problem that you'll run into is that these kinds of constraints are restricted to a single column on a table. Additionally they only apply to the current value in that

Read More

Flexible Schemas with PostgreSQL and Elasticsearch

2018 Dec26

Relational databases typically make use of a rigid schema - predefined tables containing typed columns allowing for a rich set of functionality that would otherwise be impossible. It is both a major strength as well as a major weakness. On one hand, strong typing allows databases to expose a rich set of operators, functions and functionality for each of the types. For postgres, this usually presents itself in the form of column types and SQL syntax to interact with them. On the other hand, it means that all of the data in the table is uniform, and deviations or alterations are rather difficult to do.

At the day job, I am in the process of migrating a number of applications

Read More
filed under:  postgres elasticsearch json

Exact Match OR Null Queries With Elasticsearch

2017 Dec19
Elasticsearch is an extremely powerful searchable document store. The more you use it, the more you realize how deep the rabbit hole of possibility goes. Except when it comes to null values, the Achilles heel of elasticsearch. What it really boils down to is that elasticsearch doesn't index null values or fields that are missing from a document, even if you have set up an index mapping and told elasticsearch about your field.

As you might think, it can get a little tricky searching documents for a field value or where that field is null / missing - especially when combined with other field queries. For example, let's say I have some documents that look something like
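One common way to express an "exact match OR null" condition in the elasticsearch query DSL is a `bool` query whose `should` clauses pair a `term` match with a negated `exists` check. The sketch below is only an illustration of that general pattern, not necessarily the approach the full post takes; the helper name `termOrMissing` and the field names `status` and `color` are hypothetical.

```javascript
'use strict';

// Build a clause that matches documents where `field` equals `value`
// OR where `field` is null / missing from the document.
// `termOrMissing` is a hypothetical helper name used for illustration.
function termOrMissing(field, value) {
  return {
    bool: {
      should: [
        // exact match on the field value
        { term: { [field]: value } },
        // the field has no indexed value (null or absent)
        { bool: { must_not: { exists: { field } } } }
      ],
      // at least one of the two `should` clauses has to match
      minimum_should_match: 1
    }
  };
}

// Combine the clause with another field query.
// `status` and `color` are made-up field names.
const query = {
  query: {
    bool: {
      filter: [
        { term: { status: 'active' } },
        termOrMissing('color', 'red')
      ]
    }
  }
};

console.log(JSON.stringify(query, null, 2));
```

Wrapping the pair in its own `bool` keeps the "or missing" logic self-contained, so it can sit next to other filters without changing how they match.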

Read More
filed under:  elasticsearch

SQL Like Search Queries With Elasticsearch

2017 Nov11
Elasticsearch is an amazing piece of technology. Built on top of Lucene, it offers all of the incredible search facilities that you'd expect from a full-featured search engine. What makes elasticsearch so powerful, however, is the fact that it stores the actual data that was originally indexed as JSON documents. Basically, it is a Full Text Search Database more so than a search engine. This allows elasticsearch to do things that other search engines can't do, like aggregations, scripted queries, multi-query searches, etc., all in addition to the expected searching capabilities like suggestions, spelling corrections, faceting, and so on. For these reasons people are using elasticsearch as the primary data store for massive amounts of data.

One thing

Read More
filed under:  elasticsearch sql

Live Coding On Twitch

2017 Sep14
Going to start doing some live coding on twitch! I've been in the open source space for a long while and it seems fitting that this is the next step. I don't really have a schedule in mind or much else lined out, other than: it's probably happening.

Check out my channel on twitch. I'll do my best to send out a heads up when I plan on doing something. I'll probably work on Skyring, Tastypie or maybe dust off some stuff from my megadoomer collection of things.

I'm Twitchin'!

Read More
filed under:  twitch streaming live coding

Exactly Once Execution In A Distributed System

2017 Sep04
Skyring is a distributed system for managing timers, or delayed execution, similar to `setTimeout` in javascript. The difference is that it is handled in a reliable and fault tolerant way. `setTimeout` in javascript is transient. If the running application is restarted or crashes, any pending timers are lost forever. The one guarantee that Skyring provides is that a timer will execute after the specified delay, and that it only executes once. Exactly once is an interesting challenge in distributed systems, and Skyring makes use of a number of mechanisms at the node level to achieve this. From a high level, this is what the behavior on individual nodes looks like.

Skyring Node Behavior

Shared Nothing

Skyring follows the shared nothing mantra

Read More

Custom Transports For Skyring

2017 May29
Skyring is a distributed system for managing timers. When a timer lapses, a message is delivered to a destination that you have defined. *How* that message is delivered is configurable. Out of the box, Skyring comes with an `HTTP` transport, and there is an official package enabling tcp delivery of messages with connection pooling. Transports are pretty easy to write, and you can use any of the tools you are already used to.

STDOUT Transport

To illustrate the process, we're going to make a simple transport handler to write the data to stdout. Basically speaking, a transport is just a node.js module that exports a named function.
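As a rough sketch of what "a module that exports a named function" could look like for a stdout transport: the parameter list (`method`, `url`, `payload`, `id`, `cache`) and the `cache.remove(id)` call are assumptions made for illustration, not a confirmed Skyring signature, so check the Skyring documentation for the real contract.

```javascript
'use strict';

// A named function; the name is what would identify the transport.
// ASSUMPTION: the argument list and the `cache.remove` method are
// hypothetical and only illustrate the general shape of a handler.
function stdout(method, url, payload, id, cache) {
  // "Deliver" the timer by writing its data to standard out
  process.stdout.write(`${method} ${url} ${id}: ${payload}\n`);

  // Assumed clean-up step: tell the timer store this timer is done
  if (cache && typeof cache.remove === 'function') {
    cache.remove(id);
  }
}

// The module exports the named function itself
module.exports = stdout;
```

Keeping the handler a plain function means it can be exercised directly with a stub object in place of the timer store, without a running cluster.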

Module [ˈmäjo͞ol] -n., --noun

any of a number of distinct

Read More
filed under:  zmq skyring timers node.js

Getting Started With Skyring Distributed Timers

2017 May01
The very idea of distributed timers is complex. Conceptually, it is full of race conditions and edge cases. Skyring for Node.js boils the problem space down to a simple-to-use library and API for building scalable services that need to perform time-sensitive actions. That is a mouthful - think an email gateway, a web-hook service, or auto-dialers for telephony systems. Or, in the most practical sense, any time you need functionality like `setTimeout` that has to survive restarts / crashes, or you are using a language that doesn't support non-blocking timers. Skyring fills that gap, and it is easy to use. We can get something going in fewer than 20 lines of code.

To start, we just install the skyring

Read More
filed under:  skyring timers node