Channel: Severalnines - MongoDB

About cloud lock-in and open source databases


The cloud is no longer a question of if, but of when. Many IT leaders, however, find that one consistent barrier to their adoption of the cloud is vendor lock-in. What do you do when you are forced to stay with a provider that no longer meets your needs?

But is cloud lock-in a problem?

While it may appear that you can move your workload from one cloud to another without being penalised economically (the utility billing of the major pay-as-you-go platforms such as Amazon Web Services or Azure means you only pay for the services you actually use, rather than for provisioned resources that may sit idle), the reality is that a migration may not work if the exact services and resources you depend on are not available on the cloud you are migrating to.

Hardware is a commodity, and if Cloud Infrastructure as a Service (IaaS) were just about renting VMs by the hour, Cloud IaaS would involve very little lock-in. But cloud lock-in occurs when you adopt services beyond basic IaaS. The major cloud vendors do not implement value-added services the same way, and this is especially true for database services. AWS, Google, Microsoft Azure, Oracle and IBM all offer cloud database services that work differently and are proprietary in nature, in some cases with specific APIs and data models. This means that even an open source database, combined with all of the cloud vendor’s under-the-hood automation, may not be easy to migrate to another service.

Data can be the most important asset of an organisation, and is critical to the success of cloud applications. It is also hard to move because it is stateful: the application keeps track of the state of its interactions with users and other systems. The more data you have, the harder it is to move, and services and applications tend to gravitate towards the data. For this reason, the cloud vendors will go to great lengths to run and manage your data. For instance, it is free, and relatively easy, to move any amount of data into an AWS EC2 instance, but you’ll have to pay to transfer data out of AWS. The database services on Amazon are only available on Amazon, so good luck if you want to migrate to a new cloud provider or use multiple hosting providers for your application. This puts you, as a customer, in a weak negotiating position and locks you into your current cloud vendor.

So, AWS has RDS, Aurora and DynamoDB. Microsoft has Azure DocumentDB and Azure SQL Database. Google has Cloud BigTable, Cloud Datastore, and Cloud SQL.

Severalnines recently joined the party with the NinesControl cloud service.

There are plenty of cloud databases out there already, so what makes NinesControl different? Well, if you are not prepared to go “all in” with a single cloud provider, then you might want to have a good look at NinesControl. It allows you to separate your data from the underlying cloud infrastructure. It supports multiple clouds, and you can even bring it on-prem. The automation and management build upon ClusterControl, a proven product used in production by companies like Cisco, Monster, AVG, BT and Eurovision, amongst others.

If you want to avoid cloud vendor lock-in, then take control of your data.

Author: Vinay Joosery, CEO, Severalnines AB
Vinay is a passionate advocate and builder of concepts and business around Big Data computing infrastructures. Prior to co-founding Severalnines, Vinay served as VP EMEA at Pentaho Corporation. He's also held senior management roles at MySQL / Sun Microsystems / Oracle, where he built the business around MySQL's HA and Clustering product lines having come from Ericsson's large scale real-time databases venture Alzato.


Planets9s - NinesControl announcement, scaling & sharding MongoDB - and more!


Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Check out the new NinesControl for MySQL & MongoDB in the cloud

This week we were happy to announce NinesControl, which is taking its first steps to offer quick and easy automation and management of databases for the cloud to developers and admins of all skill levels. Built on the capabilities of ClusterControl, NinesControl is a database management cloud service, with no need to install anything. It enables users to uniformly and transparently deploy and manage polyglot databases on any cloud, with no vendor lock-in. If you haven’t seen NinesControl yet, do check it out!

Try NinesControl

Watch the replay: scaling & sharding MongoDB

In this webinar replay, Art van Scheppingen, Senior Support Engineer at Severalnines, shows you how to best plan your MongoDB scaling strategy up front and how to prevent ending up with unusable secondary nodes and shards. Art also demonstrates how to leverage ClusterControl’s MongoDB scaling capabilities and have ClusterControl manage your shards.

Watch the replay

How to deploy & monitor MySQL and MongoDB clusters in the cloud with NinesControl

As part of this week’s NinesControl announcement, we’ve published this handy blog post, which shows you how to deploy and monitor MySQL Galera, MariaDB and MongoDB clusters on DigitalOcean and Amazon Web Services using NinesControl. Before you attempt to deploy, you’ll need to configure access credentials to the cloud you’d like to run on, as per the process described in the blog below.

Read the blog

How to configure access credentials in NinesControl for AWS & Digital Ocean

Once you register for NinesControl and provide your cloud “access key”, the service will launch droplets in your region of choice and provision database nodes on them. In this blog post we show you how to configure that access to DigitalOcean and AWS. You’ll be all set to start deploying and monitoring your database cluster in the cloud of your choice with NinesControl.

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

Become a MongoDB DBA: Sharding ins- and outs - part 2


In previous posts of our “Become a MongoDB DBA” series, we covered deployment, configuration, monitoring (part 1), monitoring (part 2), backup, restore, read scaling and sharding (part 1).

In the previous post we did a primer on sharding with MongoDB. We covered not only how to enable sharding on a database, and define the shard key on a collection, but also explained the theory behind it.

Once sharding is enabled on a database and collection, the data stored will keep growing and more and more chunks will come into use. Just like any database, shards need to be looked after. Some of the monitoring and management aspects of sharding, like backups, differ from those of ordinary MongoDB replicaSets, and some operations may lead to scaling or rebalancing the cluster. In this second part we will focus on the monitoring and management aspects of sharding.

Monitoring shards

The most important aspect of sharding is monitoring its performance. As the write throughput of a sharded cluster is much higher than before, you might encounter new scaling issues. The key is to find your next bottleneck.

Connections

The most obvious metric to watch is the number of connections going to each primary in the shard. Any range query that does not use the shard key will be multiplied into queries going to every shard. If these queries are not covered by a (usable) index, you might see a large increase in connections going from the shard router to the primary of each shard. Luckily a connection pool is used between the shard router and the primary, so unused connections will be reused.

You can keep an eye on the connection pool via the connPoolStats command:

mongos> db.runCommand( { "connPoolStats" : 1 } )
{
    "numClientConnections" : 10,
    "numAScopedConnections" : 0,
    "totalInUse" : 4,
    "totalAvailable" : 8,
    "totalCreated" : 23,
    "hosts" : {
        "10.10.34.11:27019" : {
            "inUse" : 1,
            "available" : 1,
            "created" : 1
        },
        "10.10.34.12:27018" : {
            "inUse" : 3,
            "available" : 1,
            "created" : 2
        },
        "10.10.34.15:27018" : {
            "inUse" : 0,
            "available" : 1,
            "created" : 1
        }
    },
    "replicaSets" : {
        "sh1" : {
            "hosts" : [
                {
                    "addr" : "10.10.34.12:27002",
                    "ok" : true,
                    "ismaster" : true,
                    "hidden" : false,
                    "secondary" : false,
                    "pingTimeMillis" : 0
                }
            ]
        },
...
    "ok" : 1
}

The output of this command gives you both combined stats and per-host stats, and additionally tells you, per replicaSet in the cluster, which host is the primary. Unfortunately this means you will have to figure out for yourself which host is part of which component.
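To make sense of the per-host numbers, a small helper can compute pool utilisation per host. This is a sketch that works on a connPoolStats document like the one above (field names follow that sample output):

```javascript
// Compute connection-pool utilisation per host from a connPoolStats document.
function poolUtilisation(stats) {
  var result = {};
  Object.keys(stats.hosts).forEach(function (host) {
    var h = stats.hosts[host];
    var total = h.inUse + h.available;
    // Fraction of pooled connections currently in use (0 if the pool is empty).
    result[host] = total > 0 ? h.inUse / total : 0;
  });
  return result;
}

// Sample figures taken from the connPoolStats output shown above.
var stats = {
  hosts: {
    "10.10.34.11:27019": { inUse: 1, available: 1, created: 1 },
    "10.10.34.12:27018": { inUse: 3, available: 1, created: 2 },
    "10.10.34.15:27018": { inUse: 0, available: 1, created: 1 }
  }
};
var util = poolUtilisation(stats);
// util["10.10.34.12:27018"] is 0.75: three of four pooled connections in use
```

A host whose utilisation stays close to 1 is a candidate for the connection bottleneck described above.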

Capacity planning

Another important set of metrics to watch is the total number of chunks, the number of chunks per node and the available disk space on your shards. Together, these give a fairly good indication of how soon it will be time to scale out with another shard.

You can fetch the chunks per shard from the shard status command:

mongos> sh.status()
--- Sharding Status ---
… 
databases:
{  "_id" : "shardtest",  "primary" : "sh1",  "partitioned" : true }
    shardtest.collection
        shard key: { "_id" : 1 }
        unique: false
        balancing: true
        chunks:
            sh1    1
            sh2    2
            sh3    1

Some caution is needed here: the output of the shard status command is not valid JSON; it is best described as inconsistently formatted, human-readable text. Alternatively, you can fetch the very same information from the config database on the shard router (it actually resides on the Configserver replicaSet):

mongos> use config
switched to db config
mongos> db.runCommand({aggregate: "chunks", pipeline: [{$group: {"_id": {"ns": "$ns", "shard": "$shard"}, "total_chunks": {$sum: 1}}}]})
{ "_id" : { "ns" : "test.usertable", "shard" : "mongo_replica_1" }, "total_chunks" : 330 }
{ "_id" : { "ns" : "test.usertable", "shard" : "mongo_replica_0" }, "total_chunks" : 328 }
{ "_id" : { "ns" : "test.usertable", "shard" : "mongo_replica_2" }, "total_chunks" : 335 }

This aggregate query is covered by a default index on the chunks collection, so running it occasionally poses little risk.
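A quick way to turn those per-shard counts into an at-a-glance health figure is to look at the spread between the most and least loaded shards; a persistently large spread suggests the balancer is falling behind or the shard key is skewed. A small sketch, using the figures from the aggregate output above:

```javascript
// Summarise per-shard chunk counts and report the spread between the
// most and the least loaded shard.
function chunkSpread(counts) {
  var values = Object.keys(counts).map(function (shard) { return counts[shard]; });
  var max = Math.max.apply(null, values);
  var min = Math.min.apply(null, values);
  return { max: max, min: min, spread: max - min };
}

// Figures from the aggregate output shown above.
var counts = { mongo_replica_0: 328, mongo_replica_1: 330, mongo_replica_2: 335 };
var s = chunkSpread(counts);
// s.spread is 7, i.e. the shards are well balanced
```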

Non-sharded databases and collections

As we described in the previous post, non-sharded databases and collections are assigned to a default primary shard. This means the database or collection is limited to the size of this primary shard, and if written to in large volumes, could use up all remaining disk space on that shard. Once that happens the shard will obviously no longer function. Therefore it is important to keep track of all existing databases and collections, and scan the config database to validate that they have been enabled for sharding.

This short script will show you the non-sharded collections on the MongoDB command line client:

use config;
var shard_collections = db.collections.find();
var sharded_names = {};
while (shard_collections.hasNext()) {
    shard = shard_collections.next();
    sharded_names[shard._id] = 1;
}

var admin_db = db.getSiblingDB("admin");
dbs = admin_db.runCommand({ "listDatabases": 1 }).databases;
dbs.forEach(function(database) {
    if (database.name != "config") {
        db = db.getSiblingDB(database.name);
        cols = db.getCollectionNames();
        cols.forEach(function(col) {
            if (col != "system.indexes") {
                // The names saved above use the "database.collection" format.
                if (sharded_names[database.name + "." + col] != 1) {
                    print(database.name + "." + col);
                }
            }
        });
    }
});

It first retrieves a list of all sharded collections and saves this for later use. Then it loops over all databases and collections, and checks whether they have been sharded or not.

Maintaining shards

Once you have a sharded environment, you also need to maintain it. Basic operations like adding shards, removing shards, rebalancing shards and making backups ensure you keep your cluster healthy and prepared for disaster. Once a shard is full you can no longer perform write operations on it, so it is essential to add new shards before that happens.

Adding a shard

Adding a shard is really simple: create a new replicaSet, and once it is up and running just simply add it with the following command on one of the shard routers:

mongos> sh.addShard("<replicaset_name>/<host>:<port>")

It suffices to add one host of the replicaSet, as this seeds the shard router with a host it can connect to in order to discover the remaining hosts.

MongoDB will then add the shard to the cluster and immediately make it available for all sharded collections. This also means that after adding a shard, the MongoDB balancer will start rebalancing all chunks over all shards. Since the capacity has increased and an empty shard has appeared, the balancer will cause additional read and write load on all shards in the cluster. You may want to disable the balancer if you are adding the extra shard during peak hours. Read more in the MongoDB Balancer section further down on why this happens and how to disable the balancer in these cases.

Removing a shard

Removing a shard will not be done often, as most people scale out their clusters. But just in case you ever need it, this section will describe how to remove a shard.

It is a bit harder to remove a shard than to add one, as this involves removing the data as well. To remove a shard, you first need to find its name:

mongos> db.adminCommand( { listShards: 1 } )
{
    "shards" : [
{ "_id" : "sh1", "host" : "sh1/10.10.34.12:27018" },
{ "_id" : "sh2", "host" : "sh2/10.10.34.15:27018" }
    ],
    "ok" : 1
}

Now we can request MongoDB to remove it, using the adminCommand:

mongos> db.adminCommand( { removeShard: "sh2" } )
{
    "msg" : "draining started successfully",
    "state" : "started",
    "shard" : "sh2",
    "note" : "you need to drop or movePrimary these databases",
    "dbsToMove" : [ ],
    "ok" : 1
}

This will start a balancing process that migrates all data from this shard to the remaining shards. Depending on your dataset, this could take anywhere from minutes to hours to finish. Also keep in mind that you must have enough disk space available on the remaining shards to accommodate the migrated data. If not, the balancer will stop once one of the shards is full.
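A rough sanity check before starting the drain is to compare the shard's data size with the free space left on the remaining shards. The helper below is a sketch only (the shard names and sizes are hypothetical, and chunk migration is not perfectly even across shards):

```javascript
// Will the data on shardToRemove fit in the combined free space of the
// remaining shards? All sizes are in bytes.
function canDrain(dataSize, freeSpace, shardToRemove) {
  var totalFree = 0;
  Object.keys(freeSpace).forEach(function (shard) {
    if (shard !== shardToRemove) totalFree += freeSpace[shard];
  });
  return totalFree > dataSize[shardToRemove];
}

// Hypothetical figures: sh2 holds 40 GB; sh1 and sh3 have 25 GB free each.
var GB = 1024 * 1024 * 1024;
var dataSize  = { sh1: 60 * GB, sh2: 40 * GB, sh3: 55 * GB };
var freeSpace = { sh1: 25 * GB, sh2: 80 * GB, sh3: 25 * GB };
var ok = canDrain(dataSize, freeSpace, "sh2");
// ok is true: 50 GB of free space remains for 40 GB of data
```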

To watch the progress you can run the removeShard command once more:

mongos> db.adminCommand( { removeShard: "sh2" } )
{
    "msg" : "draining ongoing",
    "state" : "ongoing",
    "remaining" : {
        "chunks" : NumberLong(2),
        "dbs" : NumberLong(0)
    },
    "note" : "you need to drop or movePrimary these databases",
    "dbsToMove" : [ "notsharded" ],
    "ok" : 1
}

In the output from this command you can see that the “dbsToMove” attribute is an array containing the database notsharded. If the array contains databases, this shard is the primary shard for those databases. Before the shard can be removed successfully, we need to drop or move those databases first. Moving is performed with the movePrimary command:

mongos> db.runCommand( { movePrimary : "notsharded", to : "sh1" } )

Once there are no more primary databases on the shard and the balancer has finished migrating data, MongoDB waits for you to run the removeShard command one final time. It will then report the state as completed and remove the shard:

mongos> db.adminCommand( { removeShard: "sh2" } )
{
    "msg" : "removeshard completed successfully",
    "state" : "completed",
    "shard" : "sh2",
    "ok" : 1
}

MongoDB Balancer

We have mentioned the MongoDB balancer a couple of times already. The balancer is a very basic process that has no task other than keeping the number of chunks per collection equal on every shard. In reality it does nothing but move chunks of data between shards until it is satisfied with the balance, which means it can also work against you in some cases.

The most obvious case where it can work against you is if you add a new shard with a larger storage capacity than the other shards. With one shard having more capacity than the others, the shard router will most likely place all new chunks on the shard with the most capacity available. Once a new chunk has been created on the new shard, another chunk will be moved to a different shard to keep the number of chunks in balance. It is therefore advisable to give all shards equal storage space.

Another case where it can work against you is if the shard router splits chunks when you insert data randomly. If these splits happen more often on one shard than on the others, some existing chunks may be moved to other shards, and range queries on these chunks become less effective as more shards need to be touched.

There are a couple of ways to influence the balancer. The balancer could create an additional IO workload on your data nodes, so you may wish to schedule the balancer to run only in off hours:

mongos> db.settings.update(
   { _id: "balancer" },
   { $set: { activeWindow : { start : "22:00", stop : "08:00" } } },
   { upsert: true }
)

If necessary, you can also disable the balancer for a single collection. This can help against the unnecessary moving of chunks described earlier, and is also useful for collections containing archived documents. To disable balancing on a single collection, run the following command:

mongos> sh.disableBalancing("mydata.hugearchive")
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

You also need to disable the balancer during backups to get a reliable backup; otherwise chunks of data will be moved between shards while the backup is running. To disable the balancer cluster-wide, run the following command:

mongos> sh.setBalancerState(false);
WriteResult({ "nMatched" : 0, "nUpserted" : 1, "nModified" : 0, "_id" : "balancer" })

Don’t forget to enable the balancer again after the backup has finished. Better still, use a backup tool that takes a consistent backup across all shards, instead of using mongodump. More on this in the next section.
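The sequence around a backup then looks like this in the mongos shell; a sketch only, with the backup step itself (whatever tool you use) shown as a comment:

```
mongos> sh.setBalancerState(false)   // stop chunk migrations
mongos> sh.isBalancerRunning()       // wait until this returns false
false
// ... run your backup of each replicaSet here ...
mongos> sh.setBalancerState(true)    // re-enable the balancer afterwards
```

Checking sh.isBalancerRunning() matters because setBalancerState only prevents new balancing rounds; a migration that is already in flight still has to finish.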

Backups

We have covered MongoDB backups in one of the previous posts. Everything in that blog post applies to replicaSets. Using the ordinary backup tools will allow you to make a backup of each replicaSet in the sharded cluster. With a bit of orchestration you could have them start at the same time, however this does not give you a consistent backup of the entire sharded cluster.

The problem lies in the differing sizes of the shards and the Configserver. The bigger the shard, the longer its backup takes. This means that the backup of the Configserver will probably finish first, and the backup of the largest shard a long time after it. The Configserver backup may then be missing entries that were written to the shards in the meantime, and the same applies between shards. So making a consistent backup with conventional tools is almost impossible, unless you can orchestrate every backup to finish at the same time.

That’s exactly what the Percona MongoDB consistent backup tool solves: it orchestrates every backup to start at the exact same time, and once it has finished backing up one of the replicaSets, it continues to stream the oplog of that replicaSet until the last shard has finished. Restoring such a backup requires the additional oplog entries to be replayed against the replicaSets.

Managing and monitoring MongoDB shards with ClusterControl

Adding shards

Within ClusterControl you can easily add new shards with a two-step wizard, opened from the actions drop-down:

Here you can define the topology of the new shard.

Once the new shard has been added to the cluster, the MongoDB shard router will start assigning new chunks to it, and the balancer will automatically rebalance all chunks over all the shards.

Removing shards

In case you need to remove shards, you can simply remove them via the actions drop down:

This will allow you to select the shard that you wish to remove, and the shard you wish to migrate any primary databases to:

The job that removes the shard will then perform similar actions as described earlier: it will move any primary databases to the designated shard, enable the balancer and then wait for it to move all data from the shard.

Once all the data has been removed, it will remove the shard from the UI.

Consistent backups

In ClusterControl we have enabled the support for the Percona MongoDB Consistent backup tool:

To allow ClusterControl to backup with this tool, you need to install it on the ClusterControl node. ClusterControl will then use it to orchestrate a consistent backup of the MongoDB sharded cluster. This will create a backup for each replicaSet in the cluster, including the Configserver.

Conclusion

In this second blog on MongoDB sharding, we have shown how to monitor your shards, how to add or remove shards, how to rebalance them, and how to ensure you are prepared for disaster. We also have demonstrated how these operations can be performed or scheduled from ClusterControl.

In our next blog, we will dive into a couple of good sharding practices, recommended topologies and how to choose an effective sharding strategy.

Planets9s - Top 9 Tips for MySQL Replication, MongoDB Sharding & NinesControl


Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

New webinar December 6th on Top 9 Tips to manage MySQL Replication

Join our new webinar during which Krzysztof Książek, Senior Support Engineer at Severalnines, will share his 9 Top Tips on how to build a production-ready MySQL Replication environment. From OS and DB configuration checklists to schema changes and disaster recovery,  you’ll have the 9 top tips needed for a production-ready replication setup.

Sign up for the webinar

Sign up for NinesControl for MySQL & MongoDB in the cloud

Built on the capabilities of ClusterControl, NinesControl is a database management cloud service, with no need to install anything. It enables developers and admins to uniformly and transparently deploy and manage polyglot databases on any cloud, with no vendor lock-in. If you haven’t tested NinesControl yet, do check it out - it’s free :-)

Try NinesControl

Become a MongoDB DBA: Sharding ins- and outs - part 2

Having recently discussed how to enable sharding on a MongoDB database and define the shard key on a collection, as well as explained the theory behind all this, we now focus on the monitoring and management aspects of sharding. Just like any database requires management, shards need to be looked after. Some of the monitoring and management aspects of sharding, like backups, are different than with ordinary MongoDB replicaSets. Also operations may lead to scaling or rebalancing the cluster. Find out more in this new blog post.

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

Planets9s - Eurofunk replaces Oracle with feature-rich Severalnines ClusterControl


Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Eurofunk replaces Oracle with feature-rich Severalnines ClusterControl

This week we’re happy to announce Eurofunk, one of the largest European command centre system specialists, as our latest ClusterControl customer. Severalnines was brought on board to help manage the databases used by European blue light services’ command centres, which are responsible for dispatching response teams to emergencies. Severalnines’ ClusterControl was preferred to Oracle because it improved database speed at a fraction of Oracle’s licensing costs.

Read the story

Webinar next Tuesday: How to build a stable MySQL Replication environment

If you'd like to learn how to build a stable environment with MySQL replication, this webinar is for you. From OS and DB configuration checklists to schema changes and disaster recovery, you’ll have the information needed. Join us next Tuesday as Krzysztof Książek, Senior Support Engineer at Severalnines, shares his top 9 tips on how to best build a production-ready MySQL Replication environment.

Sign up for the webinar

How to deploy MySQL & MongoDB clusters in the cloud

This blog post describes how you can easily deploy and monitor your favourite open source databases on AWS and DigitalOcean. NinesControl is a service we recently released, which helps you deploy MySQL Galera and MongoDB clusters in the cloud. As a developer, if you want unified, real-time monitoring of your database and server infrastructure, with access to 100+ key database and host metrics and custom dashboards providing insight into your operational and historic performance, then NinesControl is for you :-)

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

Planets9s - 2016’s most popular s9s resources


Welcome to this week’s Planets9s, which is also the last edition of 2016!

Throughout the year, we’ve covered here all of the latest resources and technologies we create around automation and management of open source database infrastructures for MySQL, MongoDB and PostgreSQL. Thank you for following us and for your feedback on these resources. We look forward to continuing to interact with you all in 2017 and will strive to publish more information that’s helpful and useful in the new year.

But for now, and in no particular order, here are this year’s top 9 most popular s9s resources:

MySQL on Docker blog series - Part 1

In the first blog post of this series, we cover some basics around running MySQL in a container. The term ‘Docker’ as the container platform is used throughout the series, which also covers topics such as Docker Swarm Mode and Multi-Host Networking and more.

Read the blog(s)

MySQL Load Balancing with HAProxy - Tutorial

This tutorial walks you through how to deploy, configure and manage MySQL load balancing with HAProxy using ClusterControl.

Read the tutorial

MySQL Replication for High Availability - Tutorial

Learn about a smarter Replication setup that uses a combination of advanced replication techniques including mixed binary replication logging, auto-increment offset seeding, semi-sync replication, automatic fail-over/resynchronization and one-click addition of read slaves.

Read the tutorial

The Holy Grail Webinar: Become a MySQL DBA - Database Performance Tuning

Our most popular webinar this year discusses some of the settings that are most often tweaked and which can bring you significant improvement in the performance of your MySQL database. Performance tuning is not easy, but you can go a surprisingly long way with a few basic guidelines.

Watch the replay

Become a ClusterControl DBA - Blog Series

Follow our colleague Art van Scheppingen, Senior Support Engineer, as he covers all the basic operations of ClusterControl for MySQL, MongoDB & PostgreSQL, with examples of how to do this and make the most of your setup, including a deep dive per subject to save you time.

Read the series

Top 9 Tips for building a stable MySQL Replication environment - Webinar Replay

Get all the tips & tricks needed to build a stable environment using MySQL replication, as shared by Krzysztof Ksiazek, Senior Support Engineer at Severalnines.

Watch the replay

Top 9 DevOps Tips for Going in Production with Galera Cluster for MySQL / MariaDB

Galera Cluster for MySQL / MariaDB is easy to deploy, but how does it behave under real workload, scale, and during long term operation? Find out more with this popular webinar.

Watch the replay

Migrating to MySQL 5.7 - The Database Upgrade Guide

In this whitepaper, we look at the important new changes in MySQL 5.7 and show you how to plan the test process and do a live system upgrade without downtime. For those who want to avoid connection failures during slave restarts and switchover, this document goes even further and shows you how to leverage ProxySQL to achieve a graceful upgrade process.

Download the whitepaper

Become a MongoDB DBA - Blog Series

If you are a MySQL DBA you may ask yourself: “why would I install MongoDB?”. This blog series provides an excellent starting point to get yourself prepared for MongoDB. From deployment, monitoring & trending, through to read scaling and sharding, we’ve got you covered.

Read the series

As always, feel free to share these resources with your colleagues and follow us in our social media channels.

And that’s it for this year! Here’s to happy clustering in 2017!

“See” you all then,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

Secure MongoDB and Protect Yourself from the Ransom Hack


In this blogpost we look at the recent concerns around MongoDB ransomware and security issues, and how to mitigate this threat to your own MongoDB instance.

Recently, various security blogs raised concerns that hackers are hijacking MongoDB instances and demanding a ransom for the data stored. It is not the first time unprotected MongoDB instances have been found vulnerable, and this has stirred up the discussion around MongoDB security once again.

What is the news about?

About two years ago, Saarland University in Germany raised the alarm after discovering around 40,000 MongoDB servers that were easily accessible on the internet, meaning anyone could open a connection to these servers. How did this happen?

Default binding

In the past, the MongoDB daemon bound itself to all interfaces. This meant anyone with access to any of the interfaces on the host where MongoDB is installed could connect to MongoDB. If one of these interfaces carries a public IP address, the server may be vulnerable.

Default ports

By default, MongoDB binds to its standard ports: 27017 for MongoDB replicaSets and shard routers, 27018 for shards and 27019 for Configservers. Scanning a network for these ports makes it easy to predict whether a host is running MongoDB.

Authentication

By default, MongoDB is configured without any form of authentication. This means MongoDB will not prompt for a username and password, and anyone connecting will be able to read and write data. Authentication has been part of the product since MongoDB 2.0, but it has never been part of the default configuration.
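Turning authentication on starts with creating an administrative user. A minimal sketch in the mongo shell, where the user name and password are placeholders you must replace:

```
> use admin
switched to db admin
> db.createUser({
      user: "admin",
      pwd: "use-a-strong-password-here",
      roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
  })
```

With such a user in place, mongod can be restarted with authorization enabled, and every connection then has to authenticate before it can read or write anything.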

Authorization

Part of enabling authorization is the ability to define roles. Without authentication enabled, there is also no authorization. This means anyone connecting to a MongoDB server without authentication enabled will have administrative privileges too, and those privileges stretch from defining users to configuring MongoDB at runtime.

Why is all this an issue now?

In December 2016 a hacker started exploiting these vulnerabilities for personal enrichment. The hacker steals and removes your data, and leaves the following message in a collection named WARNING:

{
     "_id" : ObjectId("5859a0370b8e49f123fcc7da"),
     "mail" : "harak1r1@sigaint.org",
     "note" : "SEND 0.2 BTC TO THIS ADDRESS 13zaxGVjj9MNc2jyvDRhLyYpkCh323MsMq AND CONTACT THIS EMAIL WITH YOUR IP OF YOUR SERVER TO RECOVER YOUR DATABASE !"
}

Demanding 0.2 bitcoin (around $200 at the time of writing) may not sound like a lot if you really want your data back. However, in the meantime your website/application cannot function normally and may be defaced, and this could potentially cost far more than the 0.2 bitcoin.

A MongoDB server is vulnerable when it has a combination of the following:

  • Bound to a public interface
  • Bound to a default port
  • No (or weak) authentication enabled
  • No firewall rules or security groups in place

The default port is debatable as a factor: any port scanner is also able to identify MongoDB running on an obscure port number.

The combination of all four factors means any attacker may be able to connect to the host. Without authentication (and authorization) the attacker can do anything with the MongoDB instance. And even if authentication has been enabled on the MongoDB host, it could still be vulnerable.

Using a network port scanner (e.g. nmap) reveals the MongoDB build info to the attacker. This means he/she is able to find potential (zero-day) exploits for your specific version, and still manage to compromise your setup. Weak passwords (e.g. admin/admin) also pose a threat, as they give the attacker an easy point of entry.

How can you protect yourself against this threat?

There are various precautions you can take:

  • Put firewall rules or security groups in place
  • Bind MongoDB only to necessary interfaces and ports
  • Enable authentication, users and roles
  • Backup often
  • Security audits

For new deployments performed from ClusterControl, we enable authentication by default, create a separate administrator user and allow MongoDB to listen on a different port than the default. The only part ClusterControl can’t set up is whether the MongoDB instance is available from outside your network.

ClusterControl
Single Console for Your Entire Database Infrastructure
Deploy, manage, monitor, scale your databases on the technology stack of your choice!

Securing MongoDB

The first step to secure your MongoDB server is to put firewall rules or security groups in place. These ensure that only the client hosts/applications that need access are able to connect to MongoDB. Also make sure MongoDB only binds to the interfaces that are really necessary in mongod.conf:

# network interfaces
net:
   port: 27017
   bindIp: 127.0.0.1,172.16.1.154

Enabling authentication and setting up users and roles is the second step. MongoDB has an easy-to-follow tutorial for enabling authentication and setting up your admin user. Keep in mind that users and passwords are still the weakest link in the chain, so make sure those are secure!
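For reference, here is a minimal sketch of the relevant mongod.conf fragment (YAML format; setting security.authorization to enabled makes MongoDB enforce authentication and roles):

```yaml
# Sketch: enforce authentication and role-based access in mongod.conf
security:
   authorization: enabled
```

After restarting mongod, create your admin user via the localhost exception before remote clients are locked out.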

After securing the server, you should ensure you always have a backup of your data. Even if the hacker manages to hijack your data, with a backup and a big enough oplog you are able to perform a point-in-time restore. Scheduling (shard consistent) backups can easily be set up in our database clustering, management and automation software called ClusterControl.

Perform security audits often: scan for any open ports from outside your hosting environment, verify that authentication has been enabled for MongoDB, and ensure the users don’t have weak passwords and/or excessive roles. For ClusterControl we have developed two advisors that verify all this. ClusterControl advisors are open source, and can be run for free using the ClusterControl community edition.

Will this be enough to protect myself against any threat?

With all these precautions in place, you will be protected against any direct threat from the internet. However, keep in mind that any compromised machine in your hosting environment may still become a stepping stone to your now-protected MongoDB servers. Be sure to upgrade MongoDB to the latest (patch) releases to stay protected against known vulnerabilities.

How to secure MongoDB from ransomware - ten tips

Following the flurry of blogs, articles and social postings that have been published in recent weeks in response to the attacks on MongoDB systems and related ransomware, we thought we’d clear through the fog and provide you with 10 straightforward, tested tips on how to best secure MongoDB (from attacks and ransomware).

What is ransomware?

By definition, ransomware is malware that secretly installs itself on your computer, encrypts your files and then demands a ransom to unlock your files or to not publish them publicly. The ransomware that hit MongoDB users in various forms over the past weeks mostly fits this definition. However, it is not malware that hits the MongoDB instances, but a scripted attack from outside.

Once the attackers have taken control of your MongoDB instance, most of them hijack your data by copying it onto their own storage. After making a copy, they erase the data on your server and leave a database with a single collection demanding ransom for your data. In addition, some also threaten to erase the copy they hold hostage if you don’t pay within three days. Some victims, who allegedly paid, never received their data in the end.

Why is this happening?

The attackers are currently scanning for MongoDB instances that are publicly available, meaning anyone can connect to these servers with a simple MongoDB connection. If the server instance does not have authentication enabled, anyone can connect without providing a username and password. The lack of authentication also implies there is no authorization in place, and anyone connecting will be treated as an administrator of the system. This is very dangerous, as now anyone can perform administrative tasks like changing data, creating users or dropping databases.

By default MongoDB doesn’t enable authentication in its configuration; it is assumed that limiting the MongoDB server to only listen on localhost is sufficient protection. This may be true in some use cases, but anyone using MongoDB in a multi-tenant software stack will immediately configure the MongoDB server to also listen on other network interfaces. If these hosts can also be accessed from the internet, this poses a big threat. This is a big shortcoming of the default MongoDB installation process.

MongoDB itself can’t be fully blamed for this, as it encourages users to enable authentication and provides a tutorial on how to enable authentication and authorization. It is the user who doesn’t apply this on their publicly accessible MongoDB servers. But what if the user never knew the server was publicly available?

Ten tips on how to secure your MongoDB deployments

Tip 1: Enable authentication

This is the easiest solution to keep unwanted people out: simply enable authentication in MongoDB. To do this explicitly, put the following lines in mongod.conf (the configuration key is called authorization, but enabling it enforces authentication as well):

security:
   authorization: enabled

If you have set the replication keyfile in the mongod.conf, you will also implicitly enable authentication.

As most current attackers are after easy-to-hijack instances, they will not attempt to break into a password protected MongoDB instance.

Tip 2: Don’t use weak passwords

Enabling authentication will not give you 100% protection against attacks. Trying weak passwords may be the next weapon of choice for the attackers. Currently MongoDB does not feature a (host) lockout after too many wrong user/password attempts, so automated attacks may be able to crack weak passwords in minutes.

Set up passwords according to good and proven password guidelines, or make use of a strong password generator.
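As a quick sketch, the openssl command line tool (assumed to be installed) can serve as such a generator:

```shell
# Sketch: generate a strong random password from the command line.
# 24 random bytes yields a 32-character base64 string, a reasonable
# starting length; adjust the byte count to taste.
openssl rand -base64 24
```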

Tip 3: Authorize users by roles

Even if you have enabled authentication, don’t just give every user an administrative role. This would be very convenient from the user’s perspective, as they can literally perform every task thinkable and don’t have to wait for a DBA to execute it. But for any attacker this is just as convenient: as soon as they gain entry to one single account, they immediately have the administrative role as well.

MongoDB has a very fine-grained set of roles, and for any type of task an applicable role is present. Ensure that the user carrying the administrative role is a user that isn’t part of the application stack. This should slim down the chances of a breached account resulting in disaster.

When provisioning MongoDB from ClusterControl, we deploy new MongoDB replicaSets and sharded clusters with a separate admin and backup user.

Tip 4: Add a replication keyfile

As mentioned in Tip #1, enabling the replication keyfile implicitly enables authentication in MongoDB. But there is a much more important reason to add a replication keyfile: once added, only hosts with the keyfile installed are able to join the replicaSet.

Why is this important? Adding new secondaries to a replicaSet normally requires the clusterManager role in MongoDB. Without authentication, any user can add a new host to the cluster and replicate your data over the internet. This way the attacker can silently and continuously tap into your data.

With the keyfile enabled, the members of the replicaSet authenticate each other using the shared key. This ensures nobody can spoof the IP of an existing host and pretend to be a secondary that isn’t supposed to be part of the cluster. In ClusterControl, we deploy all MongoDB replicaSets and sharded clusters with a replication keyfile.
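Creating the keyfile follows the MongoDB tutorial; a sketch (the path is an example, and the same file must be copied to every member of the replicaSet):

```shell
# Sketch: create a replication keyfile and lock down its permissions.
openssl rand -base64 756 > /etc/mongodb-keyfile
chmod 400 /etc/mongodb-keyfile

# Then reference it on every member in mongod.conf:
#   security:
#     keyFile: /etc/mongodb-keyfile
```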

Tip 5: Make backups regularly

Schedule backups to ensure you always have a recent copy of your data. Even if the attacker is able to remove your databases, they don’t have access to your oplog. So if your oplog is large enough to contain all transactions since the last backup, you can still perform a point-in-time recovery: restore the backup and replay the oplog up to the moment the attacker started to remove data.
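As a back-of-the-envelope check, you can compare the oplog window against the age of your last backup. A sketch in shell with hypothetical numbers (in practice the window comes from db.getReplicationInfo() and the backup age from your backup scheduler):

```shell
# Sketch: can we still do a point-in-time restore? All numbers hypothetical.
oplog_window_secs=86400      # oplog covers ~24 hours
last_backup_age_secs=21600   # last backup taken 6 hours ago

if [ "$last_backup_age_secs" -lt "$oplog_window_secs" ]; then
    echo "point-in-time restore possible"
else
    echo "oplog too small: can only restore to backup time"
fi
# prints: point-in-time restore possible
```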

ClusterControl has a very easy interface to schedule (shard consistent) backups, and restoring your data is only one click away.

Tip 6: Run MongoDB on a non-standard port

As most attackers only scan for the standard MongoDB ports, you could reconfigure MongoDB to run on a different port. This does not stop attackers who perform a full port scan; they will still discover an open MongoDB instance.

Standard ports are:

  • 27017 (mongod / mongos)
  • 27018 (mongod in sharded environments)
  • 27019 (configservers)
  • 2700x (some MongoDB tutorials)

This requires the following line to be added/changed in the mongod.conf and a restart is required:

net:
   port: 17027

In ClusterControl, you can deploy new MongoDB replicaSets and sharded clusters with custom port numbers.

Tip 7: Does your application require public access?

If you have configured MongoDB to bind to all interfaces, you may want to review whether your application actually needs external access to the datastore. If your application is a single-host solution and resides on the same host as the MongoDB server, binding MongoDB to localhost can suffice.

This requires the following line to be added/changed in the mongod.conf and a restart is required:

net:
   bindIp: 127.0.0.1

In many hosting and cloud environments with multi-tenant architectures, applications are put on different hosts than where the datastore resides. The application then connects to the datastore via the private (internal) network. If this applies to your environment, you need to ensure to bind MongoDB only to the private network.

This requires the following line to be added/changed in the mongod.conf and a restart is required:

net:
   bindIp: 127.0.0.1,172.16.1.234

Tip 8: Enable firewall rules or security groups

It is good practice to enable firewall rules on your hosts, or security groups with cloud hosting. Simply blocking the MongoDB port range from outside will keep most attackers out.

There would still be another way to get in: from the inside. If attackers gain access to another host in your private (internal) network, they can still access your datastore. A good example would be proxying TCP/IP requests via an HTTP server. Add firewall rules to the MongoDB instance and deny any host except the ones that you know for sure need access. This should, at least, limit the number of hosts that could potentially be used to get at your data. And as indicated in Tip 1, with authentication enabled, even if someone proxies into your private network they can’t steal your data.

Also, if your application does require MongoDB to be available on the public interface, you can limit the hosts accessing the database by simply adding similar firewall rules.
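To illustrate, host-level firewall rules in iptables-save format could look like the following sketch, assuming your application servers live in 172.16.1.0/24 (network and port are examples; adjust to your environment):

```
*filter
# allow MongoDB only from the private application network
-A INPUT -p tcp --dport 27017 -s 172.16.1.0/24 -j ACCEPT
# drop MongoDB connections from everywhere else
-A INPUT -p tcp --dport 27017 -j DROP
COMMIT
```

A cloud security group with an equivalent inbound rule achieves the same effect.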


Tip 9: Go Hack Yourself ! Check for external connectivity

To guarantee that you have followed all the previous tips, simply test whether anything is exposed externally. If you don’t have a host outside your hosting environment, a cloud box at any hosting provider would suffice for this check.

From the outside, check whether you can connect to your host via telnet on the command line.

In case you did change the port number of MongoDB, use the appropriate port here.

telnet your.host.com 27017

If this command returns something similar to this, the port is open:

Trying your.host.com...
Connected to your.host.com.
Escape character is '^]'.
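If telnet is not installed, bash’s built-in /dev/tcp pseudo-device offers a similar check. A sketch (host, port and the 3-second timeout are examples):

```shell
# Sketch: a telnet-free connectivity check using bash's /dev/tcp.
if timeout 3 bash -c 'cat < /dev/null > /dev/tcp/your.host.com/27017' 2>/dev/null; then
    echo "port open"
else
    echo "port closed or filtered"
fi
```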

Another method of testing would be installing nmap on the host and testing it against your host:

[you@host ~]$ sudo yum install nmap
[you@host ~]$ nmap -p 27017 --script mongodb-databases your.host.com

If nmap is able to connect, you will see something similar to this:

PORT      STATE SERVICE REASON
27017/tcp open  unknown syn-ack
| mongodb-databases:
|   ok = 1
|   databases
|     1
|       empty = false
|       sizeOnDisk = 83886080
|       name = test
|     0
|       empty = false
|       sizeOnDisk = 83886080
|       name = yourdatabase
|     2
|       empty = true
|       sizeOnDisk = 1
|       name = local
|     3
|       empty = true
|       sizeOnDisk = 1
|       name = admin
|_  totalSize = 167772160

If you only enabled authentication, nmap is able to detect the open port but not list the databases:

Starting Nmap 6.40 ( http://nmap.org ) at 2017-01-16 14:36 UTC
Nmap scan report for 10.10.22.17
Host is up (0.00031s latency).
PORT      STATE SERVICE
27017/tcp open  mongodb
| mongodb-databases:
|   code = 13
|   ok = 0
|_  errmsg = not authorized on admin to execute command { listDatabases: 1.0 }

And if you managed to secure everything from external, the output would look similar to this:

Starting Nmap 6.40 ( http://nmap.org ) at 2017-01-16 14:37 UTC
Nmap scan report for 10.10.22.17
Host is up (0.00013s latency).
PORT      STATE  SERVICE
27017/tcp closed unknown

If nmap is able to connect to MongoDB, with or without authentication enabled, it can identify which MongoDB version you are running with the mongodb-info script:

[you@host ~]$ nmap -p 27017 --script mongodb-info 10.10.22.17

Starting Nmap 6.40 ( http://nmap.org ) at 2017-01-16 14:37 UTC
Nmap scan report for 10.10.22.17
Host is up (0.00078s latency).
PORT      STATE SERVICE
27017/tcp open  mongodb
| mongodb-info:
|   MongoDB Build info
|     javascriptEngine = mozjs
|     buildEnvironment
|       distmod =
|       target_arch = x86_64
… 
|     openssl
|       running = OpenSSL 1.0.1e-fips 11 Feb 2013
|       compiled = OpenSSL 1.0.1e-fips 11 Feb 2013
|     versionArray
|       1 = 2
|       2 = 11
|       3 = -100
|       0 = 3
|     version = 3.2.10-3.0
…
|   Server status
|     errmsg = not authorized on test to execute command { serverStatus: 1.0 }
|     code = 13
|_    ok = 0

As you can see, the attacker can identify your version, build environment and even the OpenSSL libraries it was compiled against. This enables attackers to go beyond simple authentication breaches and exploit vulnerabilities in your specific MongoDB build. This is why it is essential to not expose MongoDB outside your private network, and also why you need to update/patch your MongoDB servers on a regular basis.

Tip 10: Check for excessive privileges

Even if you have implemented all the tips above, it wouldn’t hurt to go through all databases in MongoDB and check whether any user has excessive privileges. As MongoDB authenticates a user against the database that they connect to, the user may also have been granted additional rights on other databases.

For example:

use mydatastore
db.createUser(
  {
    user: "user",
    pwd: "password",
    roles: [ { role: "readWrite", db: "mydatastore" },
             { role: "readWrite", db: "admin" } ]
  }
);

In addition to a weak password and the readWrite privileges on the mydatastore database, this user also has readWrite privileges on the admin database. Connecting to mydatastore and switching to the admin database will not require re-authentication. On the contrary: this user is allowed to read and write to the admin database.

This is a good reason to review the privileges of your users on a regular basis. You can do this by the following command:

my_mongodb_0:PRIMARY> use mydatastore
switched to db mydatastore
my_mongodb_0:PRIMARY> db.getUsers();
[
    {
        "_id" : "mysdatastore.user",
        "user" : "user",
        "db" : "mysdatastore",
        "roles" : [
            {
                "role" : "readWrite",
                "db" : "mysdatastore"
            },
            {
                "role" : "readWrite",
                "db" : "admin"
            }
        ]
    }
]

As you need to repeat this process per database, this can be a lengthy and cumbersome exercise. In ClusterControl, we have an advisor that performs this check on a daily basis. ClusterControl advisors are open source, and these advisors are part of the free version of ClusterControl.

That’s all folks! Do not hesitate to get in touch if you have any questions on how to secure your database environment.


Announcing ClusterControl 1.4 - the MySQL Replication & MongoDB Edition

Today we are pleased to announce the 1.4 release of ClusterControl - the all-inclusive database management system that lets you easily deploy, monitor, manage and scale highly available open source databases in any environment; on-premise or in the cloud.

This release contains key new features for MongoDB and MySQL Replication in particular, along with performance improvements and bug fixes.

Release Highlights

For MySQL

MySQL Replication

  • Enhanced multi-master deployment
  • Flexible topology management & error handling
  • Automated failover

MySQL Replication & Load Balancers

  • Deploy ProxySQL on MySQL Replication setups and monitor performance
  • HAProxy Read-Write split configuration support for MySQL Replication setups

Experimental support for Oracle MySQL Group Replication

  • Deploy Group Replication Clusters

And support for Percona XtraDB Cluster 5.7

Download ClusterControl

For MongoDB

MongoDB & sharded clusters

  • Convert a ReplicaSet to a sharded cluster
  • Add or remove shards
  • Add Mongos/Routers

More MongoDB features

  • Step down or freeze a node
  • New Severalnines database advisors for MongoDB

Download ClusterControl

View release details and resources

New MySQL Replication Features

ClusterControl 1.4 brings a number of new features to better support replication users. You are now able to deploy a multi-master replication setup in active-standby mode. One master will actively take writes, while the other is ready to take over should the active master fail. From the UI, you can also easily add slaves under each master and reconfigure the topology by promoting new masters and failing over slaves.

Topology reconfigurations and master failovers can be risky in case of replication problems, for instance errant transactions. ClusterControl checks for such issues before any failover or switchover happens. The admin can also define whitelists and blacklists of slaves to promote to master (and vice versa). This makes it easier for admins to manage their replication setups and make topology changes when needed.

Deploy ProxySQL on MySQL Replication clusters and monitor performance

Load balancers are an essential component in database high availability. With this new release, we have extended ClusterControl with the addition of ProxySQL, created for DBAs by René Cannaò, himself a DBA trying to solve issues when working with complex replication topologies. Users can now deploy ProxySQL on MySQL Replication clusters with ClusterControl and monitor its performance.

By default, ClusterControl deploys ProxySQL in read/write split mode - your read-only traffic will be sent to slaves while your writes will be sent to a writable master. ProxySQL will also work together with the new automatic failover mechanism. Once failover happens, ProxySQL will detect the new writable master and route writes to it. It all happens automatically, without any need for the user to take action.

MongoDB & sharded clusters

MongoDB is the rising star among open source databases, and extending our support for this database has brought sharded clusters in addition to replica sets. This meant we had to add more metrics to our monitoring, add advisors and provide consistent backups for sharding. With this latest release, you can now convert a ReplicaSet cluster to a sharded cluster, add or remove shards from a sharded cluster, as well as add Mongos/routers to a sharded cluster.

New Severalnines database advisors for MongoDB

Advisors are mini programs that provide advice on specific database issues and we’ve added three new advisors for MongoDB in this ClusterControl release. The first one calculates the replication window, the second watches over the replication window, and the third checks for un-sharded databases/collections. In addition to this we also added a generic disk advisor. The advisor verifies if any optimizations can be done, like noatime and noop I/O scheduling, on the data disk that is being used for storage.

There are a number of other features and improvements that we have not mentioned here. You can find all details in the ChangeLog.

We encourage you to test this latest release and provide us with your feedback. If you’d like a demo, feel free to request one.

Thank you for your ongoing support, and happy clustering!

PS.: For additional tips & tricks, follow our blog: http://www.severalnines.com/blog/

Planets9s - NinesControl announcement, scaling & sharding MongoDB - and more!

Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Check out the new NinesControl for MySQL & MongoDB in the cloud

This week we were happy to announce NinesControl, which is taking its first steps to offer quick and easy automation and management of databases for the cloud to developers and admins of all skill levels. Built on the capabilities of ClusterControl, NinesControl is a database management cloud service, with no need to install anything. It enables users to uniformly and transparently deploy and manage polyglot databases on any cloud, with no vendor lock-in. If you haven’t seen NinesControl yet, do check it out!

Try NinesControl

Watch the replay: scaling & sharding MongoDB

In this webinar replay, Art van Scheppingen, Senior Support Engineer at Severalnines, shows you how to best plan your MongoDB scaling strategy up front and how to prevent ending up with unusable secondary nodes and shards. Art also demonstrates how to leverage ClusterControl’s MongoDB scaling capabilities and have ClusterControl manage your shards.

Watch the replay

How to deploy & monitor MySQL and MongoDB clusters in the cloud with NinesControl

As part of this week’s NinesControl announcement, we’ve published this handy blog post, which shows you how to deploy and monitor MySQL Galera, MariaDB and MongoDB clusters on DigitalOcean and Amazon Web Services using NinesControl. Before you attempt to deploy, you’ll need to configure access credentials to the cloud you’d like to run on, as per the process described in the blog below.

Read the blog

How to configure access credentials in NinesControl for AWS & Digital Ocean

Once you register for NinesControl and provide your cloud “access key”, the service will launch droplets in your region of choice and provision database nodes on them. In this blog post we show you how to configure that access to DigitalOcean and AWS. You’ll be all set to start deploying and monitoring your database cluster in the cloud of your choice with NinesControl.

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

Become a MongoDB DBA: Sharding ins- and outs - part 2

In previous posts of our “Become a MongoDB DBA” series, we covered Deployment, Configuration, Monitoring (part 1), Monitoring (part 2), backup, restore, read scaling and sharding (part 1).

In the previous post we did a primer on sharding with MongoDB. We covered not only how to enable sharding on a database, and define the shard key on a collection, but also explained the theory behind it.

Once sharding is enabled on a database and collection, the stored data will keep growing and more and more chunks will be in use. Just like any database, shards need to be looked after as well. Some of the monitoring and management aspects of sharding, like backups, are different from those of ordinary MongoDB replicaSets. Also, operations may lead to scaling or rebalancing the cluster. In this second part, we will focus on the monitoring and management aspects of sharding.

Monitoring shards

The most important aspect of sharding is monitoring its performance. As the write throughput of a sharded cluster is much higher than before, you might encounter other scaling issues. So it is key to find your next bottleneck.

Connections

The most obvious one is the number of connections going to each primary in the shards. Any range query not using the shard key will be multiplied into queries going to every shard. If these queries are not covered by any (usable) index, you might see a large increase in connections going from the shard router to the primary of each shard. Luckily, a connection pool is used between the shard router and the primaries, so unused connections will be reused.

You can keep an eye on the connection pool via the connPoolStats command:

mongos> db.runCommand( { "connPoolStats" : 1 } )
{
    "numClientConnections" : 10,
    "numAScopedConnections" : 0,
    "totalInUse" : 4,
    "totalAvailable" : 8,
    "totalCreated" : 23,
    "hosts" : {
        "10.10.34.11:27019" : { "inUse" : 1, "available" : 1, "created" : 1 },
        "10.10.34.12:27018" : { "inUse" : 3, "available" : 1, "created" : 2 },
        "10.10.34.15:27018" : { "inUse" : 0, "available" : 1, "created" : 1 }
    },
    "replicaSets" : {
        "sh1" : {
            "hosts" : [
                {
                    "addr" : "10.10.34.12:27002",
                    "ok" : true,
                    "ismaster" : true,
                    "hidden" : false,
                    "secondary" : false,
                    "pingTimeMillis" : 0
                }
            ]
        },
        ...
    },
    "ok" : 1
}

The output of this command gives you both combined stats and per-host stats, and additionally, per replicaSet in the cluster, which host is the primary. Unfortunately this means you will have to figure out yourself which host is part of which component.

Capacity planning

Another important set of metrics to watch is the total number of chunks, the chunks per node and the available disk space on your shards. Combined, these should give a fairly good indication of how soon it is time to scale out with another shard.
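To make that indication concrete, a quick back-of-the-envelope check per shard might look like this sketch in shell (the numbers and the 80% threshold are hypothetical; in practice used/total would come from df on each shard's dbPath):

```shell
# Sketch: a rough per-shard capacity check with hypothetical numbers.
used_gb=85
total_gb=100
threshold_pct=80   # arbitrary planning value

pct=$(( used_gb * 100 / total_gb ))
if [ "$pct" -ge "$threshold_pct" ]; then
    echo "shard at ${pct}% - time to plan a new shard"
fi
# prints: shard at 85% - time to plan a new shard
```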

You can fetch the chunks per shard from the shard status command:

mongos> sh.status()
--- Sharding Status ---
…
databases:
{  "_id" : "shardtest",  "primary" : "sh1",  "partitioned" : true }
    shardtest.collection
        shard key: { "_id" : 1 }
        unique: false
        balancing: true
        chunks:
            sh1    1
            sh2    2
            sh3    1

Caution has to be taken here: the shard status command does not produce valid JSON output. It is better described as inconsistently formatted, human-readable text. Alternatively, you can fetch the very same information from the config database on the shard router (it actually resides on the configserver replicaSet):

mongos> use config
switched to db config
mongos> db.runCommand({aggregate: "chunks", pipeline: [{$group: {"_id": {"ns": "$ns", "shard": "$shard"}, "total_chunks": {$sum: 1}}}]})
{ "_id" : { "ns" : "test.usertable", "shard" : "mongo_replica_1" }, "total_chunks" : 330 }
{ "_id" : { "ns" : "test.usertable", "shard" : "mongo_replica_0" }, "total_chunks" : 328 }
{ "_id" : { "ns" : "test.usertable", "shard" : "mongo_replica_2" }, "total_chunks" : 335 }

This aggregate query is covered by a default index on the chunks collection, so running it occasionally does not pose a big risk.

Non-sharded databases and collections

As we described in the previous post, non-sharded databases and collections are assigned to a default primary shard. This means the database or collection is limited to the size of this primary shard and, if written to in large volumes, could use up all remaining disk space of that shard. Once this happens, the shard will obviously no longer function. Therefore it is important to watch over all existing databases and collections, and scan the config database to validate that they have been enabled for sharding.

This short script will show you the non-sharded collections on the MongoDB command line client:

use config;
var shard_collections = db.collections.find();
var sharded_names = {};
while (shard_collections.hasNext()) {
    var shard = shard_collections.next();
    sharded_names[shard._id] = 1;
}

var admin_db = db.getSiblingDB("admin");
var dbs = admin_db.runCommand({ "listDatabases": 1 }).databases;
dbs.forEach(function(database) {
    if (database.name != "config") {
        db = db.getSiblingDB(database.name);
        var cols = db.getCollectionNames();
        cols.forEach(function(col) {
            if (col != "system.indexes") {
                if (sharded_names[database.name + "." + col] != 1) {
                    print(database.name + "." + col);
                }
            }
        });
    }
});

It first retrieves a list of all sharded collections and saves this for later usage. Then it loops over all databases and collections, and checks if they have been sharded or not.

Maintaining shards

Once you have a sharded environment, you also need to maintain it. Basic operations like adding shards, removing shards, rebalancing shards and making backups ensure you keep your cluster healthy and prepared for disaster. Once a shard is full, you can no longer perform write operations on it, so it is essential to add new shards before that happens.

Adding a shard

Adding a shard is really simple: create a new replicaSet, and once it is up and running, simply add it with the following command on one of the shard routers:

mongos> sh.addShard("<replicaset_name>/<host>:<port>")

It suffices to add one host of the replicaSet, as this will seed the shard router with a host it can connect to and detect the remaining hosts.

After this, it will add the shard to the cluster and immediately make it available for all sharded collections. This also means that after adding a shard, the MongoDB shard balancer starts balancing all chunks over all shards. Since the capacity has increased and an empty shard has appeared, the balancer causes additional read and write load on all shards in the cluster. You may want to disable the balancer if you are adding the extra shard during peak hours. Read more in the MongoDB Balancer section further down on why this happens and how to disable the balancer in these cases.

Removing a shard

Removing a shard is not done often, as most people scale out their clusters. But just in case you ever need it, this section describes how to remove a shard.

It is a bit harder to remove a shard than to add a shard, as this involves removing the data as well. To remove a shard, you need to find the name of the shard first.

mongos> db.adminCommand( { listShards: 1 } )
{
    "shards" : [
        { "_id" : "sh1", "host" : "sh1/10.10.34.12:27018" },
        { "_id" : "sh2", "host" : "sh2/10.10.34.15:27018" }
    ],
    "ok" : 1
}

Now we can request MongoDB to remove it, using the adminCommand:

mongos> db.adminCommand( { removeShard: "sh2" } )
{
    "msg" : "draining started successfully",
    "state" : "started",
    "shard" : "sh2",
    "note" : "you need to drop or movePrimary these databases",
    "dbsToMove" : [ ],
    "ok" : 1
}

This will start a balancing process that will migrate all data from this shard to the remaining shards. Depending on your dataset, this could take somewhere between minutes to hours to finish. Also keep in mind that you must have enough disk space available on the remaining shards, to be able to migrate the data. If not, the balancer will stop after one of the shards is full.

To watch the progress you can run the removeShard command once more:

mongos> db.adminCommand( { removeShard: "sh2" } )
{
    "msg" : "draining ongoing",
    "state" : "ongoing",
    "remaining" : {
        "chunks" : NumberLong(2),
        "dbs" : NumberLong(0)
    },
    "note" : "you need to drop or movePrimary these databases",
    "dbsToMove" : [ "notsharded" ],
    "ok" : 1
}

In the output of this command you can see that the attribute “dbsToMove” is an array containing the database notsharded. If the array contains databases, this shard is the primary shard for those databases. Before the shard can be removed, we need to drop or move these databases first. Moving is performed with the movePrimary command:

mongos> db.runCommand( { movePrimary : "notsharded", to : "sh1" } )

Once there are no more primary databases on the shard and the balancer has finished migrating data, MongoDB waits for you to run the removeShard command once more. It will then report the state as completed and finally remove the shard:

mongos> db.adminCommand( { removeShard: "sh2" } )
{
    "msg" : "removeshard completed successfully",
    "state" : "completed",
    "shard" : "sh2",
    "ok" : 1
}

MongoDB Balancer

We have mentioned the MongoDB balancer a couple of times before. The balancer is a very basic process with a single task: keeping the number of chunks per collection equal on every shard. So in reality it does nothing other than move chunks of data between shards until it is satisfied with the balance. This means it can also work against you in some cases.

The most obvious case where it can go wrong is if you add a new shard with a larger storage capacity than the other shards. Having a shard with more capacity than the others means the shard router will most likely place all new chunks on the shard with the largest capacity available. This means that once a new chunk has been created on the new shard, another chunk will be moved to another shard to keep the number of chunks in balance. Therefore it is advisable to give equal storage space to all shards.

Another case where it can go wrong is if your shard router splits chunks when you insert your data randomly. If these splits happen more often on one shard than on the others, some of the existing chunks may be moved to other shards, and range queries on these chunks become less effective as more shards need to be touched.
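
Before blaming the balancer, it helps to see how chunks are actually distributed. You can run sh.status(), or count the chunks per shard straight from the config database. A quick sketch:

```
mongos> use config
mongos> db.chunks.aggregate([
    { $group: { _id: "$shard", chunks: { $sum: 1 } } },
    { $sort: { chunks: -1 } }
])
```

Large differences in chunk counts per shard indicate the balancer is behind, or has been disabled.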

There are a couple of ways to influence the balancer. Since the balancer can create an additional I/O workload on your data nodes, you may wish to schedule it to run only during off-hours. The balancer window is stored in the settings collection of the config database:

mongos> use config
mongos> db.settings.update(
   { _id: "balancer" },
   { $set: { activeWindow : { start : "22:00", stop : "08:00" } } },
   { upsert: true }
)

If necessary you can also disable the balancer for a single collection. This is a good way to prevent the unnecessary moving of chunks we described earlier, and it can also be useful for collections containing archived documents. To disable balancing on a single collection, run the following command:

mongos> sh.disableBalancing("mydata.hugearchive")
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

You also need to disable the balancer during backups, to make a reliable backup; otherwise chunks of data will be moved between shards while the backup is being made. To disable the balancer, run the following command:

mongos> sh.setBalancerState(false);
WriteResult({ "nMatched" : 0, "nUpserted" : 1, "nModified" : 0, "_id" : "balancer" })

Don’t forget to enable the balancer again after the backup has finished. It may also be better to use a backup tool that takes a consistent backup across all shards, instead of using mongodump. More on this in the next section.

Backups

We have covered MongoDB backups in one of the previous posts. Everything in that blog post applies to replicaSets. Using the ordinary backup tools will allow you to make a backup of each replicaSet in the sharded cluster. With a bit of orchestration you could have them all start at the same time; however, this still does not give you a consistent backup of the entire sharded cluster.

The problem lies in the size of each shard and of the Configserver. The bigger the shard, the longer its backup takes. This means the backup of the Configserver probably finishes first, and the largest shard a very long time after. The Configserver backup will then be missing all entries written to the shards in the meantime, and the same applies between shards. So making a consistent backup with conventional tools is almost impossible, unless you can orchestrate every backup to finish at exactly the same time.

That’s exactly the problem the Percona MongoDB consistent backup tool solves. It orchestrates every backup to start at the exact same time, and once it finishes backing up one of the replicaSets, it continues to stream that replicaSet’s oplog until the last shard has finished. Restoring such a backup requires the additional oplog entries to be replayed against the replicaSets.
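
A typical invocation could look like the sketch below; the host, credentials and paths are made-up examples, and the exact flags may differ per version of the tool, so check its built-in help first:

```
$ mongodb-consistent-backup -H mongos1.example.com -P 27017 \
    -u backupuser -p ******** \
    -n nightly -l /backups/mongodb
```

Pointing the tool at a shard router lets it discover all replicaSets (including the Configserver) and back them up in parallel.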

Managing and monitoring MongoDB shards with ClusterControl

Adding shards

Within ClusterControl you can easily add new shards with a two-step wizard, opened from the actions drop-down:

Here you can define the topology of the new shard.

Once the new shard has been added to the cluster, the MongoDB shard router will use it to assign new chunks to, and the balancer will automatically balance all chunks over all the shards.

Removing shards

In case you need to remove shards, you can simply remove them via the actions drop-down:

This will allow you to select the shard that you wish to remove, and the shard you wish to migrate any primary databases to:

The job that removes the shard will then perform similar actions as described earlier: it will move any primary databases to the designated shard, enable the balancer and then wait for it to move all data from the shard.

Once all the data has been removed, it will remove the shard from the UI.

Consistent backups

In ClusterControl we have enabled the support for the Percona MongoDB Consistent backup tool:

To allow ClusterControl to back up with this tool, you need to install it on the ClusterControl node. ClusterControl will then use it to orchestrate a consistent backup of the MongoDB sharded cluster. This creates a backup for each replicaSet in the cluster, including the Configserver.

Conclusion

In this second blog on MongoDB sharding, we have shown how to monitor your shards, how to add or remove shards, how to rebalance them, and how to ensure you are prepared for disaster. We also have demonstrated how these operations can be performed or scheduled from ClusterControl.

In our next blog, we will dive into a couple of good sharding practices, recommended topologies and how to choose an effective sharding strategy.

Planets9s - Top 9 Tips for MySQL Replication, MongoDB Sharding & NinesControl


Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

New webinar December 6th on Top 9 Tips to manage MySQL Replication

Join our new webinar during which Krzysztof Książek, Senior Support Engineer at Severalnines, will share his 9 Top Tips on how to build a production-ready MySQL Replication environment. From OS and DB configuration checklists to schema changes and disaster recovery, you’ll have the 9 top tips needed for a production-ready replication setup.

Sign up for the webinar

Sign up for NinesControl for MySQL & MongoDB in the cloud

Built on the capabilities of ClusterControl, NinesControl is a database management cloud service, with no need to install anything. It enables developers and admins to uniformly and transparently deploy and manage polyglot databases on any cloud, with no vendor lock-in. If you haven’t tested NinesControl yet, do check it out - it’s free :-)

Try NinesControl

Become a MongoDB DBA: Sharding ins- and outs - part 2

Having recently discussed how to enable sharding on a MongoDB database and define the shard key on a collection, as well as explained the theory behind all this, we now focus on the monitoring and management aspects of sharding. Just like any database requires management, shards need to be looked after. Some of the monitoring and management aspects of sharding, like backups, are different than with ordinary MongoDB replicaSets. Also operations may lead to scaling or rebalancing the cluster. Find out more in this new blog post.

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

Planets9s - Eurofunk replaces Oracle with feature-rich Severalnines ClusterControl


Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Eurofunk replaces Oracle with feature-rich Severalnines ClusterControl

This week we’re happy to announce Eurofunk, one of the largest European command centre system specialists, as our latest ClusterControl customer. Severalnines was brought on board to help manage the databases used by European blue light services’ command centres who are responsible for dispatching response teams to emergencies. Severalnines’ ClusterControl was preferred to Oracle because database speed was improved at a fraction of Oracle’s licensing costs.

Read the story

Webinar next Tuesday: How to build a stable MySQL Replication environment

If you'd like to learn how to build a stable environment with MySQL replication, this webinar is for you. From OS and DB configuration checklists to schema changes and disaster recovery, you’ll have the information needed. Join us next Tuesday as Krzysztof Książek, Senior Support Engineer at Severalnines, shares his top 9 tips on how to best build a production-ready MySQL Replication environment.

Sign up for the webinar

How to deploy MySQL & MongoDB clusters in the cloud

This blog post describes how you can easily deploy and monitor your favourite open source databases on AWS and DigitalOcean. NinesControl is a service we recently released, which helps you deploy MySQL Galera and MongoDB clusters in the cloud. As a developer, if you want unified and real-time monitoring of your database and server infrastructure with access to 100+ collected key database and host metrics with custom dashboards providing insight to your operational and historic performance … Then NinesControl is for you :-)

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

Planets9s - 2016’s most popular s9s resources


Welcome to this week’s Planets9s, which is also the last edition of 2016!

Throughout the year, we’ve covered here all of the latest resources and technologies we create around automation and management of open source database infrastructures for MySQL, MongoDB and PostgreSQL. Thank you for following us and for your feedback on these resources. We look forward to continuing to interact with you all in 2017 and will strive to publish more information that’s helpful and useful in the new year.

But for now, and in no particular order, here are this year’s top 9 most popular s9s resources:

MySQL on Docker blog series - Part 1

In the first blog post of this series, we cover some basics around running MySQL in a container. The term ‘Docker’ as the container platform is used throughout the series, which also covers topics such as Docker Swarm Mode and Multi-Host Networking and more.

Read the blog(s)

MySQL Load Balancing with HAProxy - Tutorial

This tutorial walks you through how to deploy, configure and manage MySQL load balancing with HAProxy using ClusterControl.

Read the tutorial

MySQL Replication for High Availability - Tutorial

Learn about a smarter Replication setup that uses a combination of advanced replication techniques including mixed binary replication logging, auto-increment offset seeding, semi-sync replication, automatic fail-over/resynchronization and one-click addition of read slaves.

Read the tutorial

The Holy Grail Webinar: Become a MySQL DBA - Database Performance Tuning

Our most popular webinar this year discusses some of the settings that are most often tweaked and which can bring you significant improvement in the performance of your MySQL database. Performance tuning is not easy, but you can go a surprisingly long way with a few basic guidelines.

Watch the replay

Become a ClusterControl DBA - Blog Series

Follow our colleague Art van Scheppingen, Senior Support Engineer, as he covers all the basic operations of ClusterControl for MySQL, MongoDB & PostgreSQL with examples on how to do this and make most of your setup, including a deep dive per subject to save you time.

Read the series

Top 9 Tips for building a stable MySQL Replication environment - Webinar Replay

Get all the tips & tricks needed to build a stable environment using MySQL replication, as shared by Krzysztof Ksiazek, Senior Support Engineer at Severalnines.

Watch the replay

Top 9 DevOps Tips for Going in Production with Galera Cluster for MySQL / MariaDB

Galera Cluster for MySQL / MariaDB is easy to deploy, but how does it behave under real workload, scale, and during long term operation? Find out more with this popular webinar.

Watch the replay

Migrating to MySQL 5.7 - The Database Upgrade Guide

In this whitepaper, we look at the important new changes in MySQL 5.7 and show you how to plan the test process and do a live system upgrade without downtime. For those who want to avoid connection failures during slave restarts and switchover, this document goes even further and shows you how to leverage ProxySQL to achieve a graceful upgrade process.

Download the whitepaper

Become a MongoDB DBA - Blog Series

If you are a MySQL DBA you may ask yourself: “why would I install MongoDB”? This blog series provides you an excellent starting point to get yourself prepared for MongoDB. From deployment, monitoring & trending, through to scale reading and sharding, we’ve got you covered.

Read the series

As always, feel free to share these resources with your colleagues and follow us in our social media channels.

And that’s it for this year! Here’s to happy clustering in 2017!

“See” you all then,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

Secure MongoDB and Protect Yourself from the Ransom Hack


In this blogpost we look at the recent concerns around MongoDB ransomware and security issues, and how to mitigate this threat to your own MongoDB instance.

Recently, various security blogs raised concern that a hacker is hijacking MongoDB instances and asking ransom for the data stored. It is not the first time unprotected MongoDB instances have been found vulnerable, and this stirred up the discussion around MongoDB security again.

What is the news about?

About two years ago, the University of Saarland in Germany alerted that they had discovered around 40,000 MongoDB servers that were easily accessible on the internet. This meant anyone could open a connection to a MongoDB server via the internet. How did this happen?

Default binding

In the past, the MongoDB daemon bound itself to any interface. This means anyone who has access to any of the interfaces on the host where MongoDB is installed will be able to connect to MongoDB. If the server is directly connected to a public IP address on one of these interfaces, it may be vulnerable.

Default ports

By default, MongoDB binds to the standard ports: 27017 for MongoDB replicaSets or Shard Routers, 27018 for shards and 27019 for Configservers. Scanning a network for these ports makes it easy to predict whether a host is running MongoDB.

Authentication

By default, MongoDB configures itself without any form of authentication enabled. This means MongoDB will not prompt for a username and password, and anyone connecting to MongoDB will be able to read and write data. Authentication has been part of the product since MongoDB 2.0, but it has never been part of the default configuration.

Authorization

Part of enabling authorization is the ability to define roles. Without authentication enabled, there will also be no authorization. This means anyone connecting to a MongoDB server without authentication enabled will have administrative privileges too. Administrative privileges stretch from defining users to configuring the MongoDB runtime.

Why is all this an issue now?

In December 2016, hackers started exploiting these vulnerabilities for personal enrichment. The attacker steals and removes your data, and leaves the following message in the WARNING collection:

{
    "_id" : ObjectId("5859a0370b8e49f123fcc7da"),
    "mail" : "harak1r1@sigaint.org",
    "note" : "SEND 0.2 BTC TO THIS ADDRESS 13zaxGVjj9MNc2jyvDRhLyYpkCh323MsMq AND CONTACT THIS EMAIL WITH YOUR IP OF YOUR SERVER TO RECOVER YOUR DATABASE !"
}

Demanding 0.2 bitcoin (around $200 at the moment of writing) may not sound like a lot if you really want your data back. However, in the meantime your website/application cannot function normally and may be defaced, and this could potentially cost far more than the 0.2 bitcoin.

A MongoDB server is vulnerable when it has a combination of the following:

  • Bound to a public interface
  • Bound to a default port
  • No (or weak) authentication enabled
  • No firewall rules or security groups in place

The default port is debatable: any port scanner would also be able to identify MongoDB if it were placed on an obscured port number.

The combination of all four factors means any attacker may be able to connect to the host. Without authentication (and authorization) the attacker can do anything with the MongoDB instance. And even if authentication has been enabled on the MongoDB host, it could still be vulnerable.

Using a network port scanner (e.g. nmap) would reveal the MongoDB build info to the attacker. This means he/she is able to find potential (zero-day) exploits for your specific version, and still manage to compromise your setup. Also weak passwords (e.g. admin/admin) could pose a threat, as the attacker would have an easy point of entry.

How can you protect yourself against this threat?

There are various precautions you can take:

  • Put firewall rules or security groups in place
  • Bind MongoDB only to necessary interfaces and ports
  • Enable authentication, users and roles
  • Backup often
  • Security audits

For new deployments performed from ClusterControl, we enable authentication by default, create a separate administrator user and allow MongoDB to listen on a different port than the default. The only part ClusterControl can’t configure is whether the MongoDB instance is reachable from outside your network.

ClusterControl
Single Console for Your Entire Database Infrastructure
Deploy, manage, monitor, scale your databases on the technology stack of your choice!

Securing MongoDB

The first step to secure your MongoDB server is to put firewall rules or security groups in place. These ensure that only the client hosts/applications that need it are able to connect to MongoDB. Also make sure MongoDB only binds to the interfaces that are really necessary, in the mongod.conf:

# network interfaces
net:
    port: 27017
    bindIp: 127.0.0.1,172.16.1.154

Enabling authentication and setting up users and roles is the second step. MongoDB has an easy-to-follow tutorial for enabling authentication and setting up your admin user. Keep in mind that users and passwords are still the weakest link in the chain, so make sure those are secure!
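
In short, the tutorial boils down to creating an administrative user in the admin database before switching authorization on. A minimal sketch, with placeholder credentials:

```
> use admin
> db.createUser({
    user: "admin",
    pwd: "<choose a strong password>",
    roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
})
```

After restarting mongod with authorization enabled, this user can authenticate against the admin database and manage further users and roles.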

After securing the server, you should ensure you always have a backup of your data. Even if the hacker manages to hijack your data, with a backup and a big enough oplog you would be able to perform a point-in-time restore. Scheduling (shard consistent) backups can easily be set up in our database clustering, management and automation software called ClusterControl.

Perform security audits often: scan for any open ports from outside your hosting environment. Verify that authentication has been enabled for MongoDB, and ensure the users don’t have weak passwords and/or excessive roles. For ClusterControl we have developed two advisors that verify all this. ClusterControl advisors are open source, and they can be run for free using the ClusterControl community edition.

Will this be enough to protect myself against any threat?

With all these precautions in place, you will be protected against direct threats from the internet. However, keep in mind that any compromised machine in your hosting environment may still become a stepping stone to your now protected MongoDB servers. Be sure to upgrade MongoDB to the latest (patch) releases to stay protected against known vulnerabilities.


How to secure MongoDB from ransomware - ten tips


Following the flurry of blogs, articles and social postings that have been published in recent weeks in response to the attacks on MongoDB systems and related ransomware, we thought we’d clear through the fog and provide you with 10 straightforward, tested tips on how to best secure MongoDB (from attacks and ransomware).

What is ransomware?

By definition, ransomware is malware that secretly installs itself on your computer, encrypts your files and then demands a ransom to unlock your files or not publish them publicly. The ransomware that hit MongoDB users in various forms over the past weeks mostly fits this definition. However, it is not malware that hits the MongoDB instances; it is a scripted attack from outside.

Once the attackers have taken control over your MongoDB instance, most of them hijack your data by copying it onto their own storage. After making a copy they erase the data on your server, and leave behind a database with a single collection demanding ransom for your data. In addition, some also threaten to erase the copy they hold hostage if you don’t pay within 3 days. Some victims who allegedly paid never received their data back.

Why is this happening?

The attackers are currently scanning for MongoDB instances that are publicly available, meaning anyone can connect to these servers with a simple MongoDB connect. If the server instance does not have authentication enabled, anyone can connect without providing a username and password. The lack of authentication also implies there is no authorization in place, and anyone connecting will be treated as an administrator of the system. This is very dangerous, as now anyone can perform administrative tasks like changing data, creating users or dropping databases.

By default MongoDB doesn’t enable authentication in its configuration; it is assumed that limiting the MongoDB server to listen only on localhost is sufficient protection. This may be true in some use cases, but anyone using MongoDB in a multi-tenant software stack will immediately configure the MongoDB server to also listen on other network interfaces. If these hosts can also be accessed from the internet, this poses a big threat. This is a big shortcoming of the default MongoDB installation process.

MongoDB itself can’t be fully blamed for this, as its documentation encourages enabling authentication and provides a tutorial on how to enable authentication and authorization. It is the user who doesn’t apply this on their publicly accessible MongoDB servers. But what if the user never knew the server was publicly available?

Ten tips on how to secure your MongoDB deployments

Tip 1: Enable authentication

This is the easiest solution to keep unwanted people outside: simply enable authentication in MongoDB. To explicitly do this, you will need to put the following lines in the mongod.conf:

security:
    authorization: enabled

If you have set the replication keyfile in the mongod.conf, you will also implicitly enable authentication.

As most current attackers are after easy-to-hijack instances, they will not attempt to break into a password-protected MongoDB instance.

Tip 2: Don’t use weak passwords

Enabling authentication will not give you 100% protection against attacks. Trying weak passwords may be the next weapon of choice for the attackers. Currently MongoDB does not feature a (host) lockout after too many wrong user/password attempts, so automated attacks may be able to crack weak passwords in minutes.

Set up passwords according to good and proven password guidelines, or make use of a strong password generator.

Tip 3: Authorize users by roles

Even if you have enabled authentication, don’t just give every user an administrative role. This would be very convenient from the user’s perspective, as they can literally perform every task thinkable and don’t have to wait for a DBA to execute it. But for an attacker this is just as convenient: as soon as they gain access to one single account, they immediately have the administrative role as well.

MongoDB has a very strong diversification of roles, and for any type of task an applicable role is present. Ensure that the user carrying the administrative role isn’t part of the application stack. This should slim down the chances of an account breach resulting in disaster.
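
As a sketch, an application user that can only read and write its own database would look like this (the database, username and password are hypothetical):

```
> use mydata
> db.createUser({
    user: "app",
    pwd: "<strong password>",
    roles: [ { role: "readWrite", db: "mydata" } ]
})
```

Even if this account is breached, the attacker is confined to the mydata database and cannot create users, reconfigure the server or drop other databases.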

When provisioning MongoDB from ClusterControl, we deploy new MongoDB replicaSets and sharded clusters with a separate admin and backup user.

Tip 4: Add a replication keyfile

As mentioned before in Tip #1, enabling the replication keyfile will implicitly enable authentication in MongoDB. But there is a much more important reason to add a replication keyfile: once added, only hosts with the file installed are able to join the replicaSet.

Why is this important? Adding new secondaries to a replicaSet normally requires the clusterManager role in MongoDB. Without authentication, any user can add a new host to the cluster and replicate your data over the internet. This way the attacker can silently and continuously tap into your data.

With the keyfile enabled, the authentication of the replication stream will be encrypted. This ensures nobody can spoof the IP of an existing host and pretend to be another secondary that isn’t supposed to be part of the cluster. In ClusterControl, we deploy all MongoDB replicaSets and sharded clusters with a replication keyfile.
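
Generating a keyfile only takes two commands. The path below is an example; in production you would place the file somewhere like /etc/mongodb-keyfile, owned by the mongod user, and reference it via the security.keyFile setting in mongod.conf on every member:

```shell
# remove any stale copy so this can be re-run
rm -f /tmp/mongodb-keyfile
# generate a random base64 key (MongoDB accepts 6 to 1024 base64 characters)
openssl rand -base64 756 > /tmp/mongodb-keyfile
# mongod refuses keyfiles whose permissions are open to group or world
chmod 400 /tmp/mongodb-keyfile
```

The same file must be distributed to every member of the replicaSet or sharded cluster.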

Tip 5: Make backups regularly

Schedule backups to ensure you always have a recent copy of your data. Even if the attackers are able to remove your databases, they don’t have access to your oplog. So if your oplog is large enough to contain all transactions since the last backup, you can still make a point-in-time recovery: restore the backup and replay the oplog up to the moment the attacker started to remove data.
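
With the stock tools, this amounts to including the oplog in the dump and replaying it on restore. A sketch (the host and paths are examples):

```
$ mongodump --host 127.0.0.1 --port 27017 --oplog --out /backups/dump
$ mongorestore --host 127.0.0.1 --port 27017 --oplogReplay /backups/dump
```

Note that --oplog only works when dumping from a replicaSet member; for sharded clusters, use a shard-consistent backup method as described in the sharding posts.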

ClusterControl has a very easy interface to schedule (shard consistent) backups, and restoring your data is only one click away.

Tip 6: Run MongoDB on a non-standard port

As most attackers only scan for the standard MongoDB ports, you could reconfigure MongoDB to run on a different port. This would not stop attackers who perform a full port scan; they will still discover an open MongoDB instance.

Standard ports are:

  • 27017 (mongod / mongos)
  • 27018 (mongod in sharded environments)
  • 27019 (configservers)
  • 2700x (some MongoDB tutorials)

This requires the following line to be added/changed in the mongod.conf and a restart is required:

net:
   port: 17027

In ClusterControl, you can deploy new MongoDB replicaSets and sharded clusters with custom port numbers.

Tip 7: Does your application require public access?

If you have enabled MongoDB to bind on all interfaces, you may want to review whether your application actually needs external access to the datastore. If your application is a single-host solution and resides on the same host as the MongoDB server, it may suffice to bind MongoDB to localhost.

This requires the following line to be added/changed in the mongod.conf and a restart is required:

net:
   bindIp: 127.0.0.1

In many hosting and cloud environments with multi-tenant architectures, applications are put on different hosts than where the datastore resides. The application then connects to the datastore via the private (internal) network. If this applies to your environment, ensure that MongoDB binds only to the private network.

This requires the following line to be added/changed in the mongod.conf and a restart is required:

net:
   bindIp: 127.0.0.1,172.16.1.234

Tip 8: Enable firewall rules or security groups

It is good practice to enable firewall rules on your hosts, or security groups with cloud hosting. Simply disallowing the MongoDB port ranges from outside will keep most attackers out.

There would still be another way to get in: from the inside. If attackers gain access to another host in your private (internal) network, they could still access your datastore. A good example would be proxying TCP/IP requests via an HTTP server. Add firewall rules to the MongoDB instance and deny every host except the ones that you know for sure need access. This should, at least, limit the number of hosts that could potentially be used to get to your data. And as indicated in Tip 1, with authentication enabled, even someone who proxies into your private network can’t simply steal your data.

Also, if your application does require MongoDB to be available on the public interface, you can limit the hosts accessing the database by simply adding similar firewall rules.
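
As a sketch with iptables (the addresses are examples; adapt them to your own application hosts and network):

```
# allow the application server and the internal subnet
iptables -A INPUT -p tcp --dport 27017 -s 172.16.1.234 -j ACCEPT
iptables -A INPUT -p tcp --dport 27017 -s 172.16.1.0/24 -j ACCEPT
# drop everyone else
iptables -A INPUT -p tcp --dport 27017 -j DROP
```

The same allow-list approach applies to cloud security groups: open port 27017 only to the specific source addresses or groups that need it.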


Tip 9: Go Hack Yourself! Check for external connectivity

To guarantee that you have followed all previous tips, simply test whether anything is exposed externally. If you don’t have a host outside your hosting environment, a cloud box at any hosting provider would suffice for this check.

From the outside, check whether you can connect to your host via telnet on the command line.

In case you did change the port number of MongoDB, use the appropriate port here.

telnet your.host.com 27017

If this command returns something similar to this, the port is open:

Trying your.host.com...
Connected to your.host.com.
Escape character is '^]'.

Another method of testing is to install nmap on the external host and run it against your MongoDB host:

[you@host ~]$ sudo yum install nmap
[you@host ~]$ nmap -p 27017 --script mongodb-databases your.host.com

If nmap is able to connect, you will see something similar to this:

PORT      STATE SERVICE REASON
27017/tcp open  unknown syn-ack
| mongodb-databases:
|   ok = 1
|   databases
|     1
|       empty = false
|       sizeOnDisk = 83886080
|       name = test
|     0
|       empty = false
|       sizeOnDisk = 83886080
|       name = yourdatabase
|     2
|       empty = true
|       sizeOnDisk = 1
|       name = local
|     3
|       empty = true
|       sizeOnDisk = 1
|       name = admin
|_  totalSize = 167772160

If you have only enabled authentication, nmap is still able to connect to the open port, but it cannot list the databases:

Starting Nmap 6.40 ( http://nmap.org ) at 2017-01-16 14:36 UTC
Nmap scan report for 10.10.22.17
Host is up (0.00031s latency).
PORT      STATE SERVICE
27017/tcp open  mongodb
| mongodb-databases:
|   code = 13
|   ok = 0
|_  errmsg = not authorized on admin to execute command { listDatabases: 1.0 }

And if you managed to secure everything from the outside, the output will look similar to this:

Starting Nmap 6.40 ( http://nmap.org ) at 2017-01-16 14:37 UTC
Nmap scan report for 10.10.22.17
Host is up (0.00013s latency).
PORT      STATE  SERVICE
27017/tcp closed unknown

If nmap is able to connect to MongoDB, with or without authentication enabled, it can identify which MongoDB version you are running using the mongodb-info script:

[you@host ~]$ nmap -p 27017 --script mongodb-info 10.10.22.17

Starting Nmap 6.40 ( http://nmap.org ) at 2017-01-16 14:37 UTC
Nmap scan report for 10.10.22.17
Host is up (0.00078s latency).
PORT      STATE SERVICE
27017/tcp open  mongodb
| mongodb-info:
|   MongoDB Build info
|     javascriptEngine = mozjs
|     buildEnvironment
|       distmod =
|       target_arch = x86_64
…
|     openssl
|       running = OpenSSL 1.0.1e-fips 11 Feb 2013
|       compiled = OpenSSL 1.0.1e-fips 11 Feb 2013
|     versionArray
|       1 = 2
|       2 = 11
|       3 = -100
|       0 = 3
|     version = 3.2.10-3.0
…
|   Server status
|     errmsg = not authorized on test to execute command { serverStatus: 1.0 }
|     code = 13
|_    ok = 0

As you can see, an attacker can identify your version, build environment and even the OpenSSL libraries MongoDB was compiled against. This enables attackers to go beyond simple authentication breaches and exploit vulnerabilities in your specific MongoDB build. This is why it is essential not to expose MongoDB outside your private network, and why you need to update/patch your MongoDB servers on a regular basis.

Tip 10: Check for excessive privileges

Even if you have implemented all the tips above, it wouldn't hurt to go through all databases in MongoDB and check whether any user has excessive privileges. As MongoDB authenticates a user against the database that you connect to, the user may also have been granted additional rights on other databases.

For example:

use mydatastore
db.createUser(
  {
    user: "user",
    pwd: "password",
    roles: [ { role: "readWrite", db: "mydatastore" },
             { role: "readWrite", db: "admin" } ]
  }
);

In addition to a weak password and readWrite privileges on the mydatastore database, this user also has readWrite privileges on the admin database. Connecting to mydatastore and then switching to the admin database will not trigger re-authentication. On the contrary: this user is allowed to read from and write to the admin database.

This is a good reason to review the privileges of your users on a regular basis. You can do this with the following command:

my_mongodb_0:PRIMARY> use mydatastore
switched to db mydatastore
my_mongodb_0:PRIMARY> db.getUsers();
[
    {
        "_id" : "mydatastore.user",
        "user" : "user",
        "db" : "mydatastore",
        "roles" : [
            { "role" : "readWrite", "db" : "mydatastore" },
            { "role" : "readWrite", "db" : "admin" }
        ]
    }
]

As you need to repeat this process for every database, this can be a lengthy and cumbersome exercise. In ClusterControl, we have an advisor that performs this check on a daily basis. ClusterControl advisors are open source, and they are part of the free version of ClusterControl.
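The per-database check itself is easy to script. A minimal sketch of the logic in plain JavaScript, operating on user documents shaped like the db.getUsers() output above (in the mongo shell you would feed it db.getUsers() for each database in turn):

```javascript
// Sketch: flag roles granted outside a user's own authentication database.
// `users` is an array shaped like the output of db.getUsers().
function findExcessiveRoles(users) {
  const findings = [];
  for (const u of users) {
    for (const r of u.roles) {
      if (r.db !== u.db) {
        findings.push({ user: u.user, role: r.role, db: r.db });
      }
    }
  }
  return findings;
}

// The example user document from above:
const users = [{
  _id: "mydatastore.user",
  user: "user",
  db: "mydatastore",
  roles: [
    { role: "readWrite", db: "mydatastore" },
    { role: "readWrite", db: "admin" }
  ]
}];
const findings = findExcessiveRoles(users);
// flags the readWrite role granted on the admin database
```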

That’s all folks! Do not hesitate to get in touch if you have any questions on how to secure your database environment.

Announcing ClusterControl 1.4 - the MySQL Replication & MongoDB Edition


Today we are pleased to announce the 1.4 release of ClusterControl - the all-inclusive database management system that lets you easily deploy, monitor, manage and scale highly available open source databases in any environment; on-premise or in the cloud.

This release contains key new features for MongoDB and MySQL Replication in particular, along with performance improvements and bug fixes.

Release Highlights

For MySQL

MySQL Replication

  • Enhanced multi-master deployment
  • Flexible topology management & error handling
  • Automated failover

MySQL Replication & Load Balancers

  • Deploy ProxySQL on MySQL Replication setups and monitor performance
  • HAProxy Read-Write split configuration support for MySQL Replication setups

Experimental support for Oracle MySQL Group Replication

  • Deploy Group Replication Clusters

And support for Percona XtraDB Cluster 5.7

Download ClusterControl

For MongoDB

MongoDB & sharded clusters

  • Convert a ReplicaSet to a sharded cluster
  • Add or remove shards
  • Add Mongos/Routers

More MongoDB features

  • Step down or freeze a node
  • New Severalnines database advisors for MongoDB


View release details and resources


New MySQL Replication Features

ClusterControl 1.4 brings a number of new features to better support replication users. You are now able to deploy a multi-master replication setup in active-standby mode. One master will actively take writes, while the other is ready to take over should the active master fail. From the UI, you can also easily add slaves under each master and reconfigure the topology by promoting new masters and failing over slaves.

Topology reconfigurations and master failovers are usually not possible when there are replication problems, for instance errant transactions. ClusterControl checks for such issues before any failover or switchover happens. The admin can define whitelists and blacklists of slaves to promote to master (and vice versa). This makes it easier for admins to manage their replication setups and make topology changes when needed.

Deploy ProxySQL on MySQL Replication clusters and monitor performance

Load balancers are an essential component in database high availability. With this new release, we have extended ClusterControl with the addition of ProxySQL, created for DBAs by René Cannaò, himself a DBA trying to solve issues when working with complex replication topologies. Users can now deploy ProxySQL on MySQL Replication clusters with ClusterControl and monitor its performance.

By default, ClusterControl deploys ProxySQL in read/write split mode - your read-only traffic will be sent to slaves while your writes will be sent to a writable master. ProxySQL will also work together with the new automatic failover mechanism. Once failover happens, ProxySQL will detect the new writable master and route writes to it. It all happens automatically, without any need for the user to take action.

MongoDB & sharded clusters

MongoDB is the rising star among the open source databases, and extending our support for this database has brought sharded clusters in addition to replica sets. This meant we had to retrieve more metrics for our monitoring, add advisors and provide consistent backups for sharding. With this latest release, you can now convert a ReplicaSet cluster to a sharded cluster, add or remove shards from a sharded cluster, as well as add Mongos/routers to a sharded cluster.

New Severalnines database advisors for MongoDB

Advisors are mini programs that provide advice on specific database issues, and we've added three new advisors for MongoDB in this ClusterControl release. The first one calculates the replication window, the second watches over the replication window, and the third checks for un-sharded databases/collections. In addition, we also added a generic disk advisor. It verifies whether any optimizations, such as the noatime mount option and the noop I/O scheduler, can be applied to the data disk used for storage.

There are a number of other features and improvements that we have not mentioned here. You can find all details in the ChangeLog.

We encourage you to test this latest release and provide us with your feedback. If you’d like a demo, feel free to request one.

Thank you for your ongoing support, and happy clustering!

PS.: For additional tips & tricks, follow our blog: http://www.severalnines.com/blog/

Watch the evolution of ClusterControl for MySQL & MongoDB


ClusterControl reduces complexity of managing your database infrastructure on premise or in the cloud, while adding support for new technologies; enabling you to truly automate mixed environments for next-level applications.

Since the launch of ClusterControl in 2012, we’ve experienced growth in new industries with customers who are benefiting from the advancements ClusterControl has to offer.

In addition to reaching new highs in ClusterControl demand, this past year we’ve doubled the size of our team allowing us to continue to provide even more improvements to ClusterControl.

Watch this short video to see where ClusterControl stands today.

New MongoDB features in ClusterControl 1.4


Our latest release of ClusterControl turns some of the most troublesome MongoDB tasks into a mere 15 second job. New features have been added to give you more control over your cluster and perform topology changes:

  • Convert a MongoDB replicaSet to a sharded MongoDB Cluster
  • Add and remove shards
  • Add shard routers to a sharded MongoDB cluster
  • Step down or freeze a node
  • New MongoDB advisors

We will describe these added features in depth below.

Convert a MongoDB replicaSet to a sharded MongoDB cluster

As most MongoDB users start off with a replicaSet to store their database, this is the most frequently used type of cluster. If you run into scaling issues, you can scale this replicaSet either by adding more secondaries or by scaling out through sharding. You can convert an existing replicaSet into a sharded cluster yourself, but it is a long process in which you could easily make errors. In ClusterControl we have automated this process: we automatically add the Configservers and shard routers, and enable sharding.
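To give an idea of what the automation takes care of, here is a sketch of two of the building blocks (all host and replicaSet names below are made up for illustration):

```javascript
// Sketch: a mongos shard router is started against the config server replicaSet with
//   mongos --configdb <configReplSetName>/<host1,host2,host3>
function mongosConfigDbString(configRsName, configHosts) {
  return configRsName + "/" + configHosts.join(",");
}

const configdb = mongosConfigDbString("cfg", ["cfg1:27019", "cfg2:27019", "cfg3:27019"]);
// Against the new mongos, the existing replicaSet is then added as the first shard:
//   sh.addShard("my_mongodb_0/node1:27017,node2:27017,node3:27017")
```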

To convert a replicaSet into a sharded cluster, you can simply trigger it via the actions drop down:

This will open a two-step dialogue for the conversion. The first step is to define where to deploy the Configserver and shard routers:

The second step is to define where to store the data, and which config files should be used for the Configserver and shard router.

After the shard migration job has finished, the cluster overview now displays shards instead of replicaSet instances:

After converting to a sharded cluster, new shards can be added.

Add or remove shards from a sharded MongoDB cluster

Adding shards

As a MongoDB shard is technically a replicaSet, adding a new shard involves the deployment of a new replicaSet as well. Within ClusterControl we first deploy a new replicaSet and then add it to the sharded cluster.

From the ClusterControl UI, you can easily add new shards with a two step wizard, opened from the actions drop down:

Here you can define the topology of the new shard.

Once the new shard has been added to the cluster, the MongoDB shard router will start to assign new chunks to it, and the balancer will automatically balance all chunks over all the shards.
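Behind the scenes, adding the shard boils down to a single admin command against a mongos. A sketch of how that command is shaped (the replicaSet name and hosts are assumptions for illustration):

```javascript
// Sketch: the admin command that adds a replicaSet as a new shard.
function addShardCommand(rsName, hosts) {
  return { addShard: rsName + "/" + hosts.join(",") };
}

const cmd = addShardCommand("shard2rs", ["node1:27018", "node2:27018", "node3:27018"]);
// In the mongo shell, against a mongos:
//   db.adminCommand(cmd)   // equivalent to sh.addShard("shard2rs/node1:27018,...")
```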

Removing shards

Removing a shard is a bit harder than adding one, as it involves moving the data to the other shards before removing the shard itself. For all data that has been sharded across the shards, this job is performed by the MongoDB balancer.

However, any non-sharded database/collection that has this shard as its primary shard needs to be moved to another shard, which then becomes its new primary shard. For this process, MongoDB needs to know where to move these non-sharded databases/collections.

In ClusterControl you can simply remove them via the actions drop down:

This will allow you to select the shard that you wish to remove, and the shard you wish to migrate any primary databases to:

The job that removes the shard then performs the actions described earlier: it moves any primary databases to the designated shard, enables the balancer and waits for it to move all data off the shard.

Once all the data has been removed, it will remove the shard from the UI.
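For reference, a sketch of the admin command sequence such a removal job issues (shard and database names are made up; the real job also retries removeShard until its state reports "completed"):

```javascript
// Sketch of the command sequence behind a shard removal.
function buildRemoveShardPlan(shard, primaryDatabases, targetShard) {
  const plan = [{ removeShard: shard }];                   // start draining sharded data
  for (const dbName of primaryDatabases) {
    plan.push({ movePrimary: dbName, to: targetShard });   // relocate primary databases
  }
  plan.push({ removeShard: shard });                       // repeat until state: "completed"
  return plan;
}

const plan = buildRemoveShardPlan("shard2rs", ["yourdatabase"], "shard1rs");
```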

Adding additional MongoDB shard routers

Once you start to scale out your application using a MongoDB sharded cluster, you may find you are in need of additional shard routers.

Adding additional MongoDB shard routers is a very simple process with ClusterControl, just open the Add Node dialogue from the actions drop down:

This will add a new shard router to the cluster. Don’t forget to set the proper default port (27017) on the router.

Step down server

In case you wish to perform maintenance on the primary node of a replicaSet, it is better to have it first "step down" in a graceful manner before taking it offline. Stepping down a primary means the host stops being primary, becomes a secondary, and is not eligible to become primary again for a set number of seconds. The voting nodes in the MongoDB replicaSet will then elect a new primary, with the stepped-down primary excluded for that set number of seconds.

In ClusterControl we have added the step down functionality as an action on the Nodes page. To step down, simply select this as an action from the drop down:

After setting the number of seconds for stepdown and confirming, the primary will step down and a new primary will be elected.

Freeze a node

This functionality is similar to the step down command: it makes a node ineligible to become primary for a set number of seconds. This means you can prevent one or more secondary nodes from becoming primary when stepping down the primary, and in this way force a specific node to become the new primary.

In ClusterControl we have added the freeze node functionality as an action on the Nodes page. To freeze a node, simply select this as an action from the drop down:

After setting the number of seconds and confirming, the node will not be eligible as primary for the set number of seconds.
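The freeze-then-step-down combination described above can be sketched as a sequence of admin commands (host names are made up; in the mongo shell, rs.freeze() and rs.stepDown() wrap the replSetFreeze and replSetStepDown commands):

```javascript
// Sketch: freeze the secondaries you do NOT want elected, then step down the primary.
function buildForcedElectionPlan(secondariesToFreeze, freezeSecs, primaryHost, stepDownSecs) {
  const plan = secondariesToFreeze.map(host => ({
    host,
    cmd: { replSetFreeze: freezeSecs }        // rs.freeze(freezeSecs) on this secondary
  }));
  plan.push({
    host: primaryHost,
    cmd: { replSetStepDown: stepDownSecs }    // rs.stepDown(stepDownSecs) on the primary
  });
  return plan;
}

const plan = buildForcedElectionPlan(["node2:27017"], 120, "node1:27017", 60);
```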

New MongoDB advisors

Advisors are mini programs that provide advice on specific database issues. We've added three new advisors for MongoDB. The first one calculates the replication window, the second watches over the replication window, and the third checks for un-sharded databases/collections.

MongoDB Replication Lag advisor

Replication lag is very important to keep an eye on if you are scaling out reads by adding more secondaries. MongoDB will only use these secondaries if they don't lag too far behind. If a secondary has replication lag, you risk serving stale data that has already been overwritten on the primary.

To check the replication lag, it suffices to connect to the primary and retrieve this data using the replSetGetStatus command. In contrast to MySQL, the primary keeps track of the replication status of its secondaries.

We have implemented this check into an advisor in ClusterControl, to ensure your replication lag will always be watched over.
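The core of that check can be sketched in plain JavaScript. The members array below is made-up sample data shaped like the members field of the replSetGetStatus output (in the mongo shell: rs.status().members):

```javascript
// Sketch: compute per-secondary replication lag, in seconds, as
//   lag = primary optimeDate - secondary optimeDate
function replicationLagSeconds(members) {
  const primary = members.find(m => m.stateStr === "PRIMARY");
  return members
    .filter(m => m.stateStr === "SECONDARY")
    .map(m => ({ name: m.name, lag: (primary.optimeDate - m.optimeDate) / 1000 }));
}

// Made-up sample: one secondary 12 seconds behind the primary.
const members = [
  { name: "node1:27017", stateStr: "PRIMARY",   optimeDate: new Date("2017-01-16T14:36:12Z") },
  { name: "node2:27017", stateStr: "SECONDARY", optimeDate: new Date("2017-01-16T14:36:00Z") }
];
const lags = replicationLagSeconds(members);
```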

MongoDB Replication Window advisor

Just like replication lag, the replication window is an equally important metric to look at. The lag advisor already informs us how many seconds a secondary node is behind the primary/master. As the oplog is limited in size, lag imposes the following risks:

  1. If a node lags too far behind, it may not be able to catch up anymore as the transactions necessary to catch up are no longer in the oplog of the primary.
  2. A lagging secondary node is less favoured in a MongoDB election for a new primary. If all secondaries are lagging behind in replication, you will have a problem: the one with the least lag will be made primary.
  3. Secondaries that lag behind are less favoured by the MongoDB driver when scaling out reads, which also puts a higher workload on the remaining secondaries.

If a secondary node were lagging behind by a few minutes (or hours), it would be useful to have an advisor that informs us how much time we have left before our next transaction will be dropped from the oplog. The time difference between the first and last entry in the oplog is called the replication window. This metric can be calculated by fetching the first and last items from the oplog and computing the difference between their timestamps.
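The calculation itself can be sketched as follows (the timestamps below are made up; in the mongo shell, the first and last oplog entries live on the local database and can be fetched with db.getSiblingDB("local").oplog.rs.find().sort({$natural: 1}).limit(1) and the same query with sort({$natural: -1})):

```javascript
// Sketch: the replication window is the time span covered by the oplog, i.e. the
// difference between the timestamps of its first and last entries.
function replicationWindowSeconds(firstEntryTime, lastEntryTime) {
  return (lastEntryTime - firstEntryTime) / 1000;
}

// Made-up timestamps: an oplog covering two hours.
const windowSecs = replicationWindowSeconds(
  new Date("2017-01-16T12:36:00Z"),
  new Date("2017-01-16T14:36:00Z")
);
```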

In the MongoDB shell, there is already a function available that calculates the replication window for you. However, this function is built into the command line shell, so any connection not made through the shell cannot use it. Therefore we have made an advisor that watches over the replication window and alerts you if it falls below a pre-set threshold.

MongoDB un-sharded databases and collections advisor

Non-sharded databases and collections are assigned to a default primary shard by the MongoDB shard router. This means the database or collection is limited to the size of that primary shard and, if written to in large volumes, could use up all remaining disk space on the shard. Once this happens, the shard will obviously no longer function. Therefore it is important to watch over all existing databases and collections, and scan the config database to validate that they have been enabled for sharding.

To prevent this from happening, we have created an un-sharded database and collection advisor. This advisor scans every database and collection, and warns you about any that have not been sharded.
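The check itself can be sketched in plain JavaScript. In a real deployment, the full namespace list would come from enumerating each database, and the sharded namespaces from the config.collections collection on the config database; the namespaces below are made up:

```javascript
// Sketch: given all collection namespaces and the set of sharded ones
// (as recorded in config.collections), list the collections left unsharded.
function findUnshardedCollections(allNamespaces, shardedNamespaces) {
  const sharded = new Set(shardedNamespaces);
  return allNamespaces.filter(ns => !sharded.has(ns));
}

// Made-up namespaces:
const unsharded = findUnshardedCollections(
  ["yourdatabase.orders", "yourdatabase.users", "test.events"],
  ["yourdatabase.orders"]
);
```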

ClusterControl improves MongoDB maintainability

We have made a big step by adding all these improvements to ClusterControl for MongoDB replicaSets and sharded clusters. They greatly improve the usability of MongoDB in ClusterControl, and allow DBAs, sysops and devops to maintain their clusters even better!

How to automate & manage MySQL (Replication) & MongoDB with ClusterControl - live webinar


Join us next Tuesday, February 7th 2017, as Johan Andersson, CTO at Severalnines, unveils the new ClusterControl 1.4 in a live demo webinar.

ClusterControl reduces complexity of managing your database infrastructure while adding support for new technologies; enabling you to truly automate multiple environments for next-level applications. This latest release further builds out the functionality of ClusterControl to allow you to manage and secure your 24/7, mission critical infrastructures.

In this live webinar, Johan will demonstrate how ClusterControl increases your efficiency by giving you a single interface to deploy and operate your databases, instead of searching for and cobbling together a combination of open source tools, utilities and scripts that need constant updates and maintenance. Watch as ClusterControl demystifies the complexity associated with database high availability, load balancing, recovery and your other everyday struggles.

To put it simply: learn how to be a database hero with ClusterControl!

Date, Time & Registration

Europe/MEA/APAC

Tuesday, February 7th at 09:00 GMT (UK) / 10:00 CET (Germany, France, Sweden)

Register Now

North America/LatAm

Tuesday, February 7th at 9:00 Pacific Time (US) / 12:00 Eastern Time (US)

Register Now

Agenda

  • ClusterControl (1.4) Overview
  • ‘Always on Databases’ with enhanced MySQL Replication functions
  • ‘Safer NoSQL’ with MongoDB and larger sharded cluster deployments
  • ‘Enabling the DBA’ with ProxySQL, HAProxy and MaxScale
  • Backing up your open source databases
  • Live Demo
  • Q&A

Speaker

Johan Andersson, CTO, Severalnines

Johan's technical background and interest are in high performance computing as demonstrated by the work he did on main-memory clustered databases at Ericsson as well as his research on parallel Java Virtual Machines at Trinity College Dublin in Ireland. Prior to co-founding Severalnines, Johan was Principal Consultant and lead of the MySQL Clustering & High Availability consulting group at MySQL / Sun Microsystems / Oracle, where he designed and implemented large-scale MySQL systems for key customers. Johan is a regular speaker at MySQL User Conferences as well as other high profile community gatherings with popular talks and tutorials around architecting and tuning MySQL Clusters.

We look forward to “seeing” you there and to insightful discussions!

If you have any questions or would like a personalised live demo, please do contact us.
