NoSQL

NoSQL(원래 의미: non SQL 또는 non relational)^[1] 데이터베이스는 전통적인 관계형 데이터베이스 보다 덜 제한적인 일관성 모델을 이용하는 데이터의 저장 및 검색을 위한 매커니즘을 제공한다. 이러한 접근에 대한 동기에는 디자인의 단순화, 수평적 확장성, 세세한 통제를 포함한다. NoSQL 데이터베이스는 단순 검색 및 추가 작업을 위한 매우 최적화된 키 값 저장 공간으로, 레이턴시와 스루풋과 관련하여 상당한 성능 이익을 내는 것이 목적이다. NoSQL 데이터베이스는 빅데이터와 실시간 웹 애플리케이션의 상업적 이용에 널리 쓰인다. 또, NoSQL 시스템은 SQL 계열 쿼리 언어를 사용할 수 있다는 사실을 강조한다는 면에서 "Not only SQL"로 불리기도 한다.^[2]^[3]

이러한 접근의 동기는 다음을 포함한다: 설계의 단순성, 머신들의 클러스터에 대한 더 단순한 수평 확장(관계형 데이터베이스의 문제),^[4] 이용성에 대한 더 세밀한 통제. NoSQL 데이터베이스에 의해 사용되는 자료 구조(예: 키-값, 와이드 컬럼, 그래프, 도큐먼트)들은 관계형 데이터베이스에서 기본적으로 사용되는 것들과는 다르며 일부 작업들은 NoSQL에서 속도가 더 빠른 편이다. 주어진 NoSQL 데이터베이스의 특정한 적합 여부는 해결해야 하는 문제에 따라 다르다. NoSQL 데이터베이스에 쓰이는 자료 구조들은 관계형 데이터베이스 테이블보다 "더 유연한" 것으로 간주되기도 한다.^[5]

수많은 NoSQL 스토어들은 이용성, 파티션 내구성, 속도의 선호로 (CAP 정리 측면에서) 일관성을 타협한다. NoSQL 스토어를 채용하는 데 생기는 장벽에는 저급의 쿼리 언어의 사용(SQL 사용 대신. 예: 테이블을 경유하여 애드혹 조인-join을 수행하는 기능이 부족), 표준화된 인터페이스의 부족, 기존 관계형 데이터베이스의 상당한 개선이 포함된다.^[6] 대부분의 NoSQL 스토어는 진정한 ACID 트랜잭션이 결여되어 있으나 마크로직, 에어로스파이크, 페어컴(FairCom) c-treeACE, 구글 스패너(기술적으로 NewSQL 데이터베이스이긴 하지만), Symas LMDB, OrientDB 등의 일부 데이터베이스들은 이를 염두에 두고 설계하였다.

그 대신, 대부분의 NoSQL 데이터베이스들은 "궁극적인 일관성" 개념을 제공함으로써 데이터베이스의 변경사항이 모든 노드에 "궁극적으로"(일반적으로 밀리초 내) 전파되므로 데이터에 대한 모든 쿼리들이 즉각 업데이트된 데이터를 반환하지 않을 수 있고 정확하지 않은 데이터를 읽는 결과가 발생할 수 있는데 이 문제를 스테일 리드(stale read)라고 부른다.^[7] 게다가 일부 NoSQL 시스템들은 손실된 쓰기(write)와 기타 형태의 데이터 손실을 보이는 경우도 있다.^[8] 일부 NoSQL 시스템들은 로그 선행 기입과 같은 개념들을 제공하여 데이터 손실을 막는다.^[9] 여러 데이터베이스를 거치는 분산 트랜잭션 처리의 경우 데이터 일관성은 NoSQL과 관계형 데이터베이스에게 훨씬 더 큰 도전이 된다. 현행의 관계형 데이터베이스들 조차도 데이터베이스 스팬을 위한 참조 무결성 제약(referential integrity constraint)을 허용하지 않는다.^[10] 분산 트랜잭션 처리를 위해 ACID 트랜잭션과 X/Open XA 표준을 모두 준수하는 시스템들도 일부 있다.

역사

NoSQL 이니셔티브의 최초의 시각적 표현.^[11]

카를로 스트로찌(Carlo Strozzi)는 1998년 표준 SQL 인터페이스를 채용하지 않은 자신의 경량 오픈 소스 관계형 데이터베이스를 NoSQL이라고 명명했다.^[12] 스트로찌는 현재의 NoSQL 운동이 “전반적인 관계형 모델에서 점차 멀어지고 있으므로” NoREL로 부르는 것이 더 적절하다고 언급했다.^[13]

2009년 초에 라스트 FM의 요한 오스칼손(Johan Oskarsson)이 오픈 소스 분산 데이터베이스를 논하기 위한 미트업 행사를 조직하면서, 이와 같은 데이터베이스를 NoSQL이라고 불렀다.^[14] 고전적인 관계형 데이터베이스 시스템의 주요 특성을 보장하는 ACID 제공을 주로 시도하지 않은 수많은 비관계형, 분산 데이터 자료 공간의 등장에 따라 이 이름이 사용되었다.^[15]

NoSQL 데이터베이스의 예

NoSQL 데이터베이스를 분류하는 접근 방식은 분류와 하위 분류와 함께 다양하다. 다양한 접근 방식으로 인해 비관계형 데이터베이스를 포괄적으로 파악하는 데에는 어려움이 있다. 그럼에도 동의할만한 수준의 기본적인 분류는 데이터 모델에 기반을 둔다. 이 가운데 몇 가지와 이들이 가진 프로토타입은 다음과 같다:

스티븐 옌은 자신의 블로그의 글 "NoSQL is a Horseless Carriage"에서 NoSQL 데이터베이스들을 다음과 같이 분류했다.^[16]

용어	연관 데이터베이스
KV Store	Keyspace, Flare, SchemaFree, RAMCloud, Oracle NoSQL Database (OnDB)
KV Store - Eventually consistent	Dynamo, Voldemort, Dynomite, SubRecord, Mo8onDb, DovetailDB
KV Store - Hierarchical	GT.m, Cache
KV Store - Ordered	TokyoTyrant, Lightcloud, NMDB, Luxio, MemcacheDB, Actord
KV Cache	Memcached, Repcached, Coherence, Hazelcast, Infinispan, EXtremeScale, JBossCache, Velocity, Terracotta
Tuple Store	Gigaspaces, Coord, Apache River
Object Database	ZopeDB, DB40, Shoal
Document Store	CouchDB, Cloudant, Couchbase, MongoDB, Jackrabbit, XML-Databases, ThruDB, CloudKit, Prsevere, Riak-Basho, Scalaris
Wide Columnar Store	BigTable, HBase, Apache Cassandra, Hypertable, KAI, OpenNeptune, Qbase, KDI

성능

벤 스코필드는 여러 유형의 NoSQL 데이터베이스의 등급을 다음과 같이 평가했다:^[17]

데이터 모델	성능	확장성	유연성	복잡성	기능
키-값 스토어	높음	높음	높음	없음	가변적 (없음)
컬럼 지향 스토어	높음	높음	준수	낮음	최소
도큐먼트 지향 스토어	높음	가변적 (높음)	높음	낮음	가변적 (낮음)
그래프 데이터베이스	가변적	가변적	높음	높음	그래프 이론
관계형 데이터베이스	가변적	가변적	낮음	준수	관계대수

성능과 확장성 비교는 종종 YCSB 벤치마크를 통해 이루어진다.

ACID 및 JOIN 지원

ACID 속성(원자성, 일관성, 고립성, 지속성) 또는 JOIN 동작을 지원하는 것으로 문서에 표현되어 있는 경우 지원으로 표기된다. 그러나 이 기준에는 대부분의 SQL 데이터베이스와 비슷한 방식으로 완전히 지원할 필요는 없다.

데이터베이스	ACID	JOIN
Aerospike	예	아니요
Apache Ignite	예	예
ArangoDB	예	예
Amazon DynamoDB	예	아니요
Couchbase	예	예
CouchDB	예	예
IBM Db2	예	예
InfinityDB	예	아니요
LMDB	예	아니요
MarkLogic	예	예^{[nb 1]}
MongoDB	예	예^{[nb 2]}
OrientDB	예	예^{[nb 3]}

↑ Joins do not necessarily apply to document databases, but MarkLogic can do joins using semantics.^[18]
↑ MongoDB did not support joining from a sharded collection until version 5.1.^[19]
↑ OrientDB can resolve 1:1 joins using links by storing direct links to foreign records.^[20]

같이 보기

참조

↑ http://nosql-database.org/ Archived 2018년 12월 26일 - 웨이백 머신 "NoSQL DEFINITION: Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open-source and horizontally scalable"
↑ “NoSQL (Not Only SQL)”. NoSQL database, also called Not Only SQL
↑ Fowler, Martin. “NosqlDefinition”. many advocates of NoSQL say that it does not mean a "no" to SQL, rather it means Not Only SQL
↑ Leavitt, Neal (2010). “Will NoSQL Databases Live Up to Their Promise?” (PDF). 《IEEE Computer》.
↑ ^가 ^나 Vogels, Werner (2012년 1월 18일). “Amazon DynamoDB – a Fast and Scalable NoSQL Database Service Designed for Internet Scale Applications”. All Things Distributed. 2017년 3월 6일에 확인함.
↑ Grolinger, K.; Higashino, W. A.; Tiwari, A.; Capretz, M. A. M. (2013). “Data management in cloud environments: NoSQL and NewSQL data stores” (PDF). Aira, Springer. 2014년 1월 8일에 확인함.
↑ “Jepsen: MongoDB stale reads”. 《Aphyr.com》. 2015년 4월 20일. 2017년 3월 6일에 확인함.
↑ “Large volume data analysis on the Typesafe Reactive Platform”. 《Slideshare.net》. 2017년 3월 6일에 확인함.
↑ Fowler, Adam. “10 NoSQL Misconceptions”. 《Dummies.com》. 2017년 3월 6일에 확인함.
↑ “No! to SQL and No! to NoSQL | So Many Oracle Manuals, So Little Time”. 《Iggyfernandez.wordpress.com》. 2017년 3월 6일에 확인함.
↑ Sargolzaei Javan, Morteza; Mohanna, Farahnaz (2009). 〈Multi Dimensional and Flexible Model for Databases〉. 《Innovations and Advances in Computer Sciences and Engineering》. Springer. 159–164쪽.
↑ Lith, Adam; Jakob Mattson (2010). “Investigating storage solutions for large data: A comparison of well performing and scalable data storage solutions for real time extraction and batch insertion of data” (PDF). Göteborg: Department of Computer Science and Engineering, Chalmers University of Technology. 70쪽. 2011년 5월 12일에 확인함. Carlo Strozzi first used the term NoSQL in 1998 as a name for his open source relational database that did not offer a SQL interface[...]
↑ “NoSQL Relational Database Management System: Home Page”. Strozzi.it. 2007년 10월 2일. 2010년 3월 29일에 확인함.
↑ “NoSQL 2009”. Blog.sym-link.com. 2009년 5월 12일. 2011년 7월 16일에 원본 문서에서 보존된 문서. 2010년 3월 29일에 확인함.
↑ Mike Chapple. “The ACID Model”. 2016년 12월 29일에 원본 문서에서 보존된 문서. 2013년 10월 27일에 확인함.
↑ A Yes for a NoSQL Taxonomy. High Scalability (2009-11-05). Retrieved on 2013-09-18.
↑ Scofield, Ben (2010년 1월 14일). “NoSQL - Death to Relational Databases(?)”. 2014년 6월 26일에 확인함.
↑ “Can't do joins with MarkLogic? It's just a matter of Semantics! - General Networks”. 《Gennet.com》. 2017년 3월 3일에 원본 문서에서 보존된 문서. 2017년 3월 6일에 확인함.
↑ “Sharded Collection Restrictions”. 《docs.mongodb.com》. 2020년 1월 24일에 확인함.
↑ “SQL Reference · OrientDB Manual”. 《OrientDB.com》. 2020년 1월 24일에 확인함.

추가 문헌

Pramod Sadalage and Martin Fowler (2012). 《NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence》. Addison-Wesley. ISBN 0-321-82662-0.
Christof Strauch (2012). “NoSQL Databases” (PDF).
Moniruzzaman AB, Hossain SA (2013). “NoSQL Database: New Era of Databases for Big data Analytics - Classification, Characteristics and Comparison”.
Kai Orend (2013). “Analysis and Classification of NoSQL Databases and Evaluation of their Ability to Replace an Object-relational Persistence Layer”.
Ganesh Krishnan, Sarang Kulkarni, Dharmesh Kirit Dadbhawala. “Method and system for versioned sharing, consolidating and reporting information”. CS1 관리 - 여러 이름 (링크)

외부 링크

Christoph Strauch. “NoSQL whitepaper” (PDF). Hochschule der Medien, Stuttgart.
Martin Fowler. “NoSQL Guide”.
Stefan Edlich. “NoSQL database List”. 2018년 12월 26일에 원본 문서에서 보존된 문서. 2013년 10월 27일에 확인함.
Peter Neubauer (2010). “Graph Databases, NOSQL and Neo4j”.
Sergey Bushik (2012). “A vendor-independent comparison of NoSQL databases: Cassandra, HBase, MongoDB, Riak”. NetworkWorld. 2014년 5월 28일에 원본 문서에서 보존된 문서. 2013년 10월 27일에 확인함.

[MarkLogicJoins-19] Joins do not necessarily apply to document databases, but MarkLogic can do joins using semantics.^[18]

[MongoDBJoin-21] MongoDB did not support joining from a sharded collection until version 5.1.^[19]

[OrientDBJoin-23] OrientDB can resolve 1:1 joins using links by storing direct links to foreign records.^[20]

[1] ttp://nosql-database.org/ Archived 2018년 12월 26일 - 웨이백 머신 "NoSQL DEFINITION: Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open-source and horizontally scalable"

[2] “NoSQL (Not Only SQL)”. NoSQL database, also called Not Only SQL

[3] Fowler, Martin. “NosqlDefinition”. many advocates of NoSQL say that it does not mean a "no" to SQL, rather it means Not Only SQL

[leavitt-4] Leavitt, Neal (2010). “Will NoSQL Databases Live Up to Their Promise?” (PDF). 《IEEE Computer》.

[:1-5] 가 ^나 Vogels, Werner (2012년 1월 18일). “Amazon DynamoDB – a Fast and Scalable NoSQL Database Service Designed for Internet Scale Applications”. All Things Distributed. 2017년 3월 6일에 확인함.

[6] Grolinger, K.; Higashino, W. A.; Tiwari, A.; Capretz, M. A. M. (2013). “Data management in cloud environments: NoSQL and NewSQL data stores” (PDF). Aira, Springer. 2014년 1월 8일에 확인함.

[7] “Jepsen: MongoDB stale reads”. 《Aphyr.com》. 2015년 4월 20일. 2017년 3월 6일에 확인함.

[8] “Large volume data analysis on the Typesafe Reactive Platform”. 《Slideshare.net》. 2017년 3월 6일에 확인함.

[9] Fowler, Adam. “10 NoSQL Misconceptions”. 《Dummies.com》. 2017년 3월 6일에 확인함.

[10] “No! to SQL and No! to NoSQL | So Many Oracle Manuals, So Little Time”. 《Iggyfernandez.wordpress.com》. 2017년 3월 6일에 확인함.

[msjavan-11] Sargolzaei Javan, Morteza; Mohanna, Farahnaz (2009). 〈Multi Dimensional and Flexible Model for Databases〉. 《Innovations and Advances in Computer Sciences and Engineering》. Springer. 159–164쪽.

[:0-12] Lith, Adam; Jakob Mattson (2010). “Investigating storage solutions for large data: A comparison of well performing and scalable data storage solutions for real time extraction and batch insertion of data” (PDF). Göteborg: Department of Computer Science and Engineering, Chalmers University of Technology. 70쪽. 2011년 5월 12일에 확인함. Carlo Strozzi first used the term NoSQL in 1998 as a name for his open source relational database that did not offer a SQL interface[...]

[13] “NoSQL Relational Database Management System: Home Page”. Strozzi.it. 2007년 10월 2일. 2010년 3월 29일에 확인함.

[14] “NoSQL 2009”. Blog.sym-link.com. 2009년 5월 12일. 2011년 7월 16일에 원본 문서에서 보존된 문서. 2010년 3월 29일에 확인함.

[15] Mike Chapple. “The ACID Model”. 2016년 12월 29일에 원본 문서에서 보존된 문서. 2013년 10월 27일에 확인함.

[16] A Yes for a NoSQL Taxonomy. High Scalability (2009-11-05). Retrieved on 2013-09-18.

[17] Scofield, Ben (2010년 1월 14일). “NoSQL - Death to Relational Databases(?)”. 2014년 6월 26일에 확인함.

[18] “Can't do joins with MarkLogic? It's just a matter of Semantics! - General Networks”. 《Gennet.com》. 2017년 3월 3일에 원본 문서에서 보존된 문서. 2017년 3월 6일에 확인함.

[20] “Sharded Collection Restrictions”. 《docs.mongodb.com》. 2020년 1월 24일에 확인함.

[22] “SQL Reference · OrientDB Manual”. 《OrientDB.com》. 2020년 1월 24일에 확인함.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

[14]

[15]

[16]

[17]

[nb 1]

[nb 2]

[nb 3]

[18]

[19]

[20]