Drivine is a graph database driver for Node.js and TypeScript implemented on top of the NestJS platform. It is designed to support multiple graph databases (simultaneously, if you wish) and to scale to hundreds or thousands of transactions per second. It allows you to meet these goals without compromising architectural integrity.
Drivine provides a sweet-spot level of abstraction, with management and object to graph mapping (OGM) features. This includes the following:
The library contains a persistence manager interface:
export interface PersistenceManager {
query<T>(spec: QuerySpecification<T>): Promise<T[]>;
getOne<T>(spec: QuerySpecification<T>): Promise<T>;
maybeGetOne<T>(spec: QuerySpecification<T>): Promise<T | undefined>;
openCursor<T>(spec: QuerySpecification<T>): Promise<Cursor<T>>;
}
It is the job of the persistence manager to obtain a connection, using database details that are registered when the library is bootstrapped, or at runtime. It will use pooling if this entails a performance benefit on the given database platform.
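To make the difference between the query methods concrete, here is a minimal in-memory sketch. It is not Drivine's implementation: `QuerySpec`, its `rows` field and `InMemoryPersistenceManager` are hypothetical stand-ins, and the assumption that `getOne` rejects unless exactly one result comes back (while `maybeGetOne` tolerates none) is inferred from the method signatures above.

```typescript
// Hypothetical, simplified stand-in for a query specification.
interface QuerySpec<T> {
  statement: string;
  // Hypothetical canned result set, so the sketch runs without a database.
  rows: T[];
}

class InMemoryPersistenceManager {
  async query<T>(spec: QuerySpec<T>): Promise<T[]> {
    return [...spec.rows];
  }

  // Assumption: exactly one result is expected; anything else is an error.
  async getOne<T>(spec: QuerySpec<T>): Promise<T> {
    const results = await this.query(spec);
    if (results.length !== 1) {
      throw new Error(`Expected exactly one result, got ${results.length}`);
    }
    return results[0] as T;
  }

  // Assumption: no result yields undefined; more than one is still an error.
  async maybeGetOne<T>(spec: QuerySpec<T>): Promise<T | undefined> {
    const results = await this.query(spec);
    if (results.length > 1) {
      throw new Error(`Expected at most one result, got ${results.length}`);
    }
    return results[0];
  }
}
```

Under these assumptions, a caller picks `getOne` when a missing row is a bug and `maybeGetOne` when absence is a normal outcome.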
Repositories are a common pattern for structuring object-oriented code so that it adheres to the single responsibility principle (SRP). They logically group database operations for a particular type of entity.
Simply by using composition, the PersistenceManager can be used to implement repositories. Here is an example:
@Injectable()
export class HealthRepository {
constructor(@InjectPersistenceManager() readonly persistenceManager: PersistenceManager) {}
async countAllVertices(): Promise<number> {
return this.persistenceManager.getOne<number>(new QuerySpecification(`match (n) return count(n)`));
}
}
Just as repositories can be composed using a PersistenceManager, so services can be composed using repositories. But what about transactions? When analyzing the functional and non-functional requirements of a system, transactions fall under what is known as a cross-cutting concern. They are so called because they are required in many places. In other words, they cut across many modules.
Requirements like these don’t fit well with pure object-oriented programming because they can compromise efforts to adhere to SRP. Imagine implementing a service method that is all about transferring funds, and then adding transaction behaviors. Now add security. And then audit. Before too long, the class that neatly represented a single role has become a mess.
Fortunately, transactional concerns can easily be modularized in TypeScript using Decorators, a kind of higher-order function that wraps the original function, in our case with transactional concerns applied.
Apply the @Transactional() decorator to transactional methods.
@Injectable()
export class RouteRepository {
constructor(
@InjectPersistenceManager() readonly persistenceManager: PersistenceManager,
@InjectCypher(__dirname, 'routesBetween') readonly routesBetween: CypherStatement) {}
@Transactional()
async findFastestBetween(start: string, destination: string): Promise<Route> {
return this.persistenceManager.getOne(
new QuerySpecification<Route>()
.withStatement(this.routesBetween)
.bind([start, destination])
.limit(1)
.transform(Route)
);
}
}
By default the decorator will start a new transaction if one does not exist. Otherwise it will participate in an existing transaction. When the outermost transactional method completes, the entire stack will be committed. Meanwhile, if an error is thrown, the transaction is rolled back.
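The propagation rule described above can be sketched with a plain higher-order function instead of decorator syntax. This is a sketch under stated assumptions, not Drivine's implementation: `TransactionLog`, `transactional`, `findAccount` and `transferBetween` are hypothetical names, and the "transaction" is only an event log.

```typescript
// Hypothetical in-memory transaction tracker; it only records events.
class TransactionLog {
  depth = 0;
  readonly events: string[] = [];
}

// Higher-order function: wraps `fn` so that it runs inside a transaction.
function transactional<A extends unknown[], R>(
  log: TransactionLog,
  fn: (...args: A) => Promise<R>
): (...args: A) => Promise<R> {
  return async (...args: A): Promise<R> => {
    // Only the outermost call begins, commits or rolls back.
    const outermost = log.depth === 0;
    if (outermost) log.events.push("begin");
    log.depth++;
    try {
      const result = await fn(...args);
      if (outermost) log.events.push("commit");
      return result;
    } catch (error) {
      if (outermost) log.events.push("rollback");
      throw error; // normal error handling still applies to the caller
    } finally {
      log.depth--;
    }
  };
}

const log = new TransactionLog();

// Inner "repository" method: participates in an existing transaction.
const findAccount = transactional(log, async (id: string): Promise<string> => {
  if (id === "bad") throw new Error("Invalid Id!");
  return id;
});

// Outer "service" method: starts the transaction that the inner calls join.
const transferBetween = transactional(log, async (a: string, b: string): Promise<void> => {
  await findAccount(a);
  await findAccount(b);
});
```

A shared depth counter is only safe for one call chain at a time. A real implementation would need context that follows each asynchronous call chain, which this sketch leaves out.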
We can now compose transactional services from one or more repositories as follows:
export class TransferService {
constructor(readonly accountRepo: AccountRepository) {}
@Transactional()
async transfer(sourceId: string, targetId: string, amount: number): Promise<void> {
const sourceAccount = await this.accountRepo.findById(sourceId);
sourceAccount.deductFunds(amount);
await this.accountRepo.update(sourceAccount);
// Throws Error! Invalid Id!
const targetAccount = await this.accountRepo.findById(targetId);
targetAccount.depositFunds(amount);
await this.accountRepo.update(targetAccount);
}
}
In the above example, if the targetId is invalid, all data operations are rolled back. Otherwise a commit is performed when the outermost transactional method completes. Note that we used normal error handling: no boilerplate code was required, and the code clearly represents a single task.
It is well and good to assemble an application where transactional services are made up from an aggregate of repositories. However, for some use-cases the volume of data is too large to be buffered in a single result set.
Fortunately the PersistenceManager that we saw earlier provides the ability to open a Cursor<T>, which provides two kinds of streaming capabilities.
Let's explore. In the example below, the repository method promises to return a Cursor for a Route.
@Transactional()
async findRoutesBetween(start: string, destination: string): Promise<Cursor<Route>> {
return this.persistenceManager.openCursor(
new CursorSpecification<Route>()
.withStatement(this.routesBetween)
.bind([start, destination])
.batchSize(5)
.transform(Route)
);
}
Cursor<T> implements AsyncIterable. This means that it can be used with a 'for await...of' statement. During the execution of the loop, results will be pulled in batches until the upstream is depleted.
const cursor = await repo.findRoutesBetween('Cavite Island', 'NYC');
for await (const item of cursor) {
// The cursor will read more results, executing
// database reads, until the stream is consumed.
}
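How batched pulling behind an AsyncIterable might work can be sketched with an async generator. This is an illustration only, not Drivine's Cursor: `BatchCursor`, `FetchBatch` and the in-memory `rows` array are hypothetical stand-ins for a real database read.

```typescript
// Hypothetical batch reader: returns up to `limit` rows starting at `offset`.
type FetchBatch<T> = (offset: number, limit: number) => Promise<T[]>;

class BatchCursor<T> implements AsyncIterable<T> {
  // Counts simulated database round trips, for illustration only.
  reads = 0;
  private readonly fetchBatch: FetchBatch<T>;
  private readonly batchSize: number;

  constructor(fetchBatch: FetchBatch<T>, batchSize: number) {
    this.fetchBatch = fetchBatch;
    this.batchSize = batchSize;
  }

  async *[Symbol.asyncIterator](): AsyncGenerator<T, void, undefined> {
    let offset = 0;
    while (true) {
      // The next batch is fetched only when the consumer asks for more.
      const batch = await this.fetchBatch(offset, this.batchSize);
      this.reads++;
      for (const item of batch) {
        yield item;
      }
      // A short batch means the upstream is depleted.
      if (batch.length < this.batchSize) return;
      offset += batch.length;
    }
  }
}

// In-memory stand-in for a result set of 12 rows, read 5 at a time.
const rows = Array.from({ length: 12 }, (_, i) => `route-${i}`);
const cursor = new BatchCursor<string>(
  async (offset, limit) => rows.slice(offset, offset + limit),
  5
);
```

Iterating this cursor with 'for await...of' yields all 12 items using three reads (5, 5 and 2 rows). If the loop exits early, the remaining batches are never fetched.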
Besides being AsyncIterable, cursors can turn themselves into a Readable stream. Why would we need this? AsyncIterable is helpful, but it may lead to problems when a tight loop is pushing data into a stream, such as a file stream. Even though data is pulled in batches, pushing too quickly into a stream will cause problems. The following example could potentially crash:
const cursor = await repo.findRoutesBetween('Cavite Island', 'NYC');
for await (const item of cursor) {
// The return value is ignored, so a slow sink can be overrun.
const result = fileStream.write(item);
}
Don't overload stream buffers. An overloaded buffer is signalled when fileStream.write(item) returns false. If you ignore this warning and continue, your application can crash. The solution is as follows:
cursor.asStream().pipe(fileStream);
await StreamUtils.untilClosed(fileStream);
Now new information will be pulled from the cursor only as needed. The sink stream (fileStream) coordinates this, so data arrives at the rate the sink can handle.
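The same idea can be tried with Node's standard stream API alone. The sketch below assumes the cursor's stream behaves like an ordinary Node Readable. `routes`, `SlowSink` and `copyWithBackPressure` are hypothetical names; `Readable.from` and `pipeline` are standard Node functions, used here in place of Drivine's own helpers.

```typescript
import { Readable, Writable } from "node:stream";
import { pipeline } from "node:stream/promises";

// Stand-in for a cursor: any AsyncIterable will do.
async function* routes(count: number): AsyncGenerator<string> {
  for (let i = 0; i < count; i++) {
    yield `route-${i}\n`;
  }
}

// A deliberately slow sink with a tiny buffer, standing in for a file stream.
class SlowSink extends Writable {
  readonly received: string[] = [];

  constructor() {
    super({ highWaterMark: 1, decodeStrings: false });
  }

  _write(
    chunk: string,
    _encoding: BufferEncoding,
    callback: (error?: Error | null) => void
  ): void {
    this.received.push(chunk);
    // Simulate slow I/O: acknowledge each chunk after a short delay.
    setTimeout(callback, 1);
  }
}

async function copyWithBackPressure(count: number): Promise<string[]> {
  const sink = new SlowSink();
  // pipeline pauses the source while the sink's buffer is full, resumes it
  // on 'drain', and resolves once the sink has finished.
  await pipeline(Readable.from(routes(count)), sink);
  return sink.received;
}
```

Because the sink sets the pace, the generator is only advanced when there is room for more data, so memory use stays bounded however many items the source can produce.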
Drivine provides graph-to-object mapping facilities. However, as described in the introduction, its design goal of optimal performance means that it differs from the most typical approaches.
In a typical application that uses an ORM or OGM:
This approach can work. However, when an application needs to be highly scalable, it entails several problems:
In the referenced article, Mr Fowler states that he is not sure why the anemic domain model anti-pattern is so predominant. In my opinion, one primary cause is that most ORM/OGM tools do not offer to inject service dependencies into the entities. In order to exhibit richer, more expressive behavior, entities will surely need collaborating services to assist them. To support rich models, the OGM tool can offer to inject registered services, so that these entities have an opportunity to provide expressive behaviors and are not just holders of data. A future version of Drivine may offer to do this.
In the meantime, whatever the cause, this still does nothing to address the first four points, so let's explore.
In order to benefit from cleanly architected code that can scale to many thousands of transactions per second, Drivine takes the following approach:
We may need to perform type coercion of dates, numbers or enums. We can load a set of related entities into a type. Object mapping is provided by the class-transformer library.
export class Route {
readonly start: string;
readonly destination: string;
readonly metros: string[];
@Expose({ name: 'travel_time' })
readonly travelTime: number;
@Type(() => Photo)
readonly significantSites: Photo[];
constructor(start: string, destination: string, metros: string[], travelTime: number) {
this.start = start;
this.destination = destination;
this.metros = metros;
this.travelTime = travelTime;
}
/**
* Returns metros omitting the start and destination.
*/
intermediateMetros(): string[] {
const result = [...this.metros];
result.shift();
result.pop();
return result;
}
}
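To show what this mapping achieves without any library, here is a hand-written sketch of the same transformation: renaming travel_time to travelTime and coercing it to a number. `RawRoute`, `MappedRoute`, `toRoute` and the sample record are hypothetical. In Drivine, the decorators above let the transformer do this work.

```typescript
// Hypothetical raw record, shaped as a query might return it.
interface RawRoute {
  start: string;
  destination: string;
  metros: string[];
  travel_time: number | string;
}

class MappedRoute {
  readonly start: string;
  readonly destination: string;
  readonly metros: string[];
  readonly travelTime: number;

  constructor(start: string, destination: string, metros: string[], travelTime: number) {
    this.start = start;
    this.destination = destination;
    this.metros = metros;
    this.travelTime = travelTime;
  }

  // Returns metros omitting the start and destination.
  intermediateMetros(): string[] {
    return this.metros.slice(1, -1);
  }
}

// Manual equivalent of the rename and coercion rules.
function toRoute(raw: RawRoute): MappedRoute {
  return new MappedRoute(
    raw.start,
    raw.destination,
    [...raw.metros],
    Number(raw.travel_time) // coerce "26" to 26
  );
}

// Illustrative data only.
const route = toRoute({
  start: "Cavite Island",
  destination: "NYC",
  metros: ["Cavite Island", "Manila", "Tokyo", "NYC"],
  travel_time: "26",
});
```

The result is a real class instance rather than a bare record, so behavior such as intermediateMetros() is available to callers; here it returns the two middle stops.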
To test transactional methods, we need to initialize Drivine within the test context. Here's how:
it('should return the fastest route between start and dest ', async () => {
return inDrivineContext().withTransaction({rollback: true}).run(async () => {
const result = await request(app.getHttpServer())
.get('/routes/between/Pigalle/NYC')
.expect(HttpStatus.OK);
expect(result.body[0].travelTime).toEqual(8.5);
});
});
Optionally, we can run integration and end-to-end tests inside a rollback transaction. This is useful when testing against shared data and/or when testing transactional database operations in parallel. Within the test it is possible to read and make assertions upon uncommitted data, after which the transaction is rolled back, restoring the database to a clean state.
Avoid repetition. It is possible to run all tests for a given spec with Drivine, by adding RunWithDrivine() at the top of the spec:
RunWithDrivine();
describe('RouteRepository', () => {
let repo: RouteRepository;
beforeAll(async () => {
const app: TestingModule = await Test.createTestingModule({
imports: [AppModule],
providers: [RouteRepository],
controllers: []
}).compile();
repo = app.get(RouteRepository);
});
it('should find routes between two cities, ordered by most expedient', async () => {
const results = await repo.findRoutesBetween('Cavite Island', 'NYC');
expect(results.length).toBeGreaterThan(0);
expect(results[0].travelTime).toEqual(26);
});
});
Supports multiple graph databases, simultaneously if you wish! Currently Neo4j and AgensGraph.
Supports distributed transactions, even across databases of different types.
Scales to hundreds or thousands of transactions per second, without compromising architectural integrity.
Facilitates the use of well understood object-oriented and functional programming patterns.
Supports implementation of code that adheres to a single responsibility principle (SRP).
Takes care of infrastructure concerns, so that you can focus on making the most of your data.
Removes boiler plate code, especially the tedious and error-prone kind.
Supports streaming without back-pressure problems. Large amounts of data can be managed in a timely and memory-efficient manner.
Light-weight use-case specific object graph mapping (OGM).
Battle-tested — used in high-traffic production applications.
Demonstrates how to bootstrap Drivine.
Shows how Drivine takes care of boiler-plate code.
Demonstrates declarative transaction management.
Shows how to implement repositories, optionally with streaming.
Contains graph database koans for common use-cases, like recommendations, routing, social, etc.
Clone this repository and open it in your favourite IDE or editor.
Set up either Neo4j or AgensGraph.
First time using a graph database? You can start testing Drivine right now with Neo4j sandbox.
Define a .env file based on .env.example which contains database authentication credentials.
Proceed to the exercises here.
Start using Drivine! Check out the User Guide.
Passionate about graph databases and network science. Self-taught musician & mathematician. Open-source veteran.
You're a little intimidating, what with the RBF and those arcane spells and all. And you're way smarter than me. You're on my team!
Life is death. Death is Life. Teach the deserving. Teach with Passion. You will guide us.
In all honesty, you're a pretty freaky guy. None of your jokes make any sense. But you get stuff done. We need you and you got the job, Shaman.
We're putting the band back together. People wonder if the Bard adds all that much. You and I know that none of this would be possible without you.
A. Yes.
Please send me a tweet @doctor_cerulean and I will answer as best as I can.
A. Drivine is now released under the Apache 2.0 Software License.
A. Nothing! It is free. Just enjoy it.
A. If you like, though you probably don't need it :) Drivine has been deployed in highly demanding production systems and has a growing community of users. It is a free and open-source project. If you would like to request a feature, please submit a request on the GitHub page. If you're not sure how to do something Drivine-related, you may raise a question under the drivine tag on StackOverflow.
A. Yes, it can. You can use Drivine to access multiple databases from a single code base. These can be registered up-front or dynamically.