UFO ET IT

Key-value pairs in a relational database

ufoet 2020. 11. 9. 21:40


Does anyone have experience storing key-value pairs in a database?

I have been using this type of table:

CREATE TABLE key_value_pairs ( 
    itemid           varchar(32) NOT NULL,
    itemkey         varchar(32) NOT NULL,
    itemvalue       varchar(32) NOT NULL,
    CONSTRAINT ct_primarykey PRIMARY KEY(itemid,itemkey)
)

For example, the following rows could exist:

 itemid            itemkey        itemvalue    
 ----------------  -------------  ------------ 
 123               Colour         Red            
 123               Size           Medium             
 123               Fabric         Cotton

The trouble with this scheme is that the SQL needed to extract the data is quite complex. Would it be better to create a series of key/value columns instead?

CREATE TABLE key_value_pairs ( 
    itemid            varchar(32) NOT NULL,
    itemkey1        varchar(32) NOT NULL,
    itemvalue1      varchar(32) NOT NULL,
    itemkey2        varchar(32) NOT NULL,
    itemvalue2      varchar(32) NOT NULL,
 . . .etc . . .
)

This would be easier and faster to query, but it lacks the extensibility of the first approach. Any advice?


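Before you continue with your approach, I would suggest stepping back and considering whether you really want to store this data in a "key-value pair" table. I don't know your application, but in my experience, every time I have done what you are doing, I later wished I had created a colour table, a fabric table, and a size table.

Think about referential integrity constraints. With the key-value pair approach, the database cannot tell you when you are trying to store a colour ID in a size field.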
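
As a rough illustration of that point, here is a minimal sketch using Python's built-in sqlite3 module (my choice for a runnable example, not something from the question). The table names colours, sizes, and items and the sample ids are hypothetical. With separate lookup tables, the database itself rejects a colour id placed in the size column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
    CREATE TABLE colours (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE sizes   (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE items (
        id        INTEGER PRIMARY KEY,
        colour_id INTEGER NOT NULL REFERENCES colours (id),
        size_id   INTEGER NOT NULL REFERENCES sizes (id)
    );
    INSERT INTO colours (id, name) VALUES (1, 'Red'), (2, 'Blue');
    INSERT INTO sizes   (id, name) VALUES (10, 'Medium');
""")

# A valid row: colour 1 and size 10 both exist.
conn.execute("INSERT INTO items (id, colour_id, size_id) VALUES (123, 1, 10)")

# Id 2 is a colour, not a size, so the foreign key on size_id rejects it.
try:
    conn.execute("INSERT INTO items (id, colour_id, size_id) VALUES (124, 1, 2)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

item_count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
```

A single generic key-value table has no equivalent check, because every value lives in the same column regardless of its domain.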

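Think about the performance benefit of joining on a table with 10 values versus a generic value table that may hold thousands of values across multiple domains. How useful is an index on the key value really going to be?

Usually the reason for doing what you are doing is that the domains need to be "user definable". If that is the case, then even I am not going to push you towards creating tables on the fly (although that is a feasible approach).

However, if your reasoning is that this will be easier to manage than multiple tables, or that you envision a generic maintenance user interface for all domains, then stop and think really hard before you continue.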


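There is another solution that falls somewhere between the two. You can use an XML type column for the keys and values. So you keep the itemid field, then have an XML field that contains the XML defined for some key-value pairs, like <items> <item key="colour" value="red"/><item key="xxx" value="blah"/></items>. Then, when you extract your data from the database, you can process the XML in a number of different ways, depending on your usage. This is an extensible solution.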
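
One possible way to process such a document on the application side, sketched with Python's standard sqlite3 and xml.etree.ElementTree modules. The table name item_attributes, the column attrs_xml, and the helper load_attributes are hypothetical, and SQLite has no XML type, so the document is stored as plain text here.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
# SQLite has no XML type, so the document is stored as TEXT in this sketch.
conn.execute(
    "CREATE TABLE item_attributes (itemid TEXT PRIMARY KEY, attrs_xml TEXT NOT NULL)"
)
conn.execute(
    "INSERT INTO item_attributes VALUES (?, ?)",
    ("123", '<items><item key="colour" value="red"/><item key="xxx" value="blah"/></items>'),
)

def load_attributes(connection, itemid):
    """Return the key/value pairs stored in an item's XML document as a dict."""
    row = connection.execute(
        "SELECT attrs_xml FROM item_attributes WHERE itemid = ?", (itemid,)
    ).fetchone()
    if row is None:
        return {}
    root = ET.fromstring(row[0])
    return {item.get("key"): item.get("value") for item in root.findall("item")}

attributes = load_attributes(conn, "123")
```

Databases with a native XML type can usually query inside the document as well, but the syntax for that differs by product.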


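In most cases where you would use the first method, it is because you haven't really sat down and thought out your model: "Well, we don't know what the keys will be yet." Generally, this is pretty poor design. It is going to be slower than actually having your keys as columns, which they should be.

I would also question why your id is a varchar.

In the rare case that you really must implement a key/value table, the first solution is fine, although I would generally want the keys in a separate table so that you are not storing varchars as the keys in your key/value table.

For example: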

CREATE TABLE valid_keys ( 
    id            NUMBER(10) NOT NULL,
    description   varchar(32) NOT NULL,
    CONSTRAINT pk_valid_keys PRIMARY KEY(id)
);

CREATE TABLE item_values ( 
    item_id NUMBER(10) NOT NULL,
    key_id  NUMBER(10) NOT NULL,
    item_value VARCHAR2(32) NOT NULL,
    -- composite key, so that one item can hold a value for several keys
    CONSTRAINT pk_item_values PRIMARY KEY(item_id, key_id),
    CONSTRAINT fk_item_values_iv FOREIGN KEY (key_id) REFERENCES valid_keys (id)
);

Then you can even go nuts and add a "TYPE" to the keys, allowing some type checking.
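
A minimal sketch of what such a type check might look like, using Python's built-in sqlite3 rather than Oracle. The value_type column, its two allowed values, and the set_value helper are all hypothetical. The sketch also assumes a composite primary key on (item_id, key_id) so that one item can carry several keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE valid_keys (
        id          INTEGER PRIMARY KEY,
        description TEXT NOT NULL,
        value_type  TEXT NOT NULL CHECK (value_type IN ('text', 'integer'))
    );
    CREATE TABLE item_values (
        item_id    INTEGER NOT NULL,
        key_id     INTEGER NOT NULL REFERENCES valid_keys (id),
        item_value TEXT NOT NULL,
        PRIMARY KEY (item_id, key_id)
    );
    INSERT INTO valid_keys VALUES (1, 'Colour', 'text'), (2, 'Weight in grams', 'integer');
""")

def set_value(connection, item_id, key_id, value):
    """Store a value after checking it against the type declared for its key."""
    row = connection.execute(
        "SELECT value_type FROM valid_keys WHERE id = ?", (key_id,)
    ).fetchone()
    if row is None:
        raise KeyError(f"unknown key id {key_id}")
    if row[0] == "integer":
        int(value)  # raises ValueError if the text is not a whole number
    connection.execute(
        "INSERT INTO item_values (item_id, key_id, item_value) VALUES (?, ?, ?)",
        (item_id, key_id, str(value)),
    )

set_value(conn, 123, 1, "Red")
set_value(conn, 123, 2, "250")
try:
    set_value(conn, 123, 2, "heavy")  # not an integer, so it is refused
    type_error_caught = False
except ValueError:
    type_error_caught = True

stored = conn.execute(
    "SELECT key_id, item_value FROM item_values WHERE item_id = 123 ORDER BY key_id"
).fetchall()
```

The check lives in application code here; the database still stores every value as text.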


I once used key-value pairs in a database for the purpose of creating a spreadsheet (used for data entry) in which a teller would summarize his activity from working a cash drawer. Each k/v pair represented a named cell into which the user entered a monetary amount. The primary reason for this approach is that the spreadsheet was highly subject to change. New products and services were added routinely (thus new cells appeared). Also, certain cells were not needed in certain situations and could be dropped.

The app I wrote was a rewrite of an application that did break the teller sheet into separate sections each represented in a different table. The trouble here was that as products and services were added, schema modifications were required. As with all design choices there are pros and cons to taking a certain direction as compared to another. My redesign certainly performed slower and more quickly consumed disk space; however, it was highly agile and allowed for new products and services to be added in minutes. The only issue of note, however, was disk consumption; there were no other headaches I can recall.

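As others have mentioned, the reason you would typically consider a key-value pair approach is that users (possibly business owners) want to create their own types, each with a user-specific set of attributes. In such situations, I have come to the following conclusions.

If there is no need to retrieve data by these attributes, or if searching can be deferred to the application once a chunk of data has been retrieved, I recommend storing all the attributes in a single text field (using JSON, YAML, XML, etc.). If you do need to retrieve data by these attributes, it gets messy.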
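
A small sketch of the single-text-field option, assuming JSON as the format and Python's built-in sqlite3 and json modules. The items table and the sample attributes are hypothetical. The attribute filter runs in the application after a chunk of rows has been fetched.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, attributes TEXT NOT NULL)")
rows = [
    (1, {"Colour": "Red", "Size": "Medium"}),
    (2, {"Colour": "Blue", "Fabric": "Cotton"}),
    (3, {"Colour": "Red", "Fabric": "Wool"}),
]
conn.executemany(
    "INSERT INTO items (id, attributes) VALUES (?, ?)",
    [(item_id, json.dumps(attrs)) for item_id, attrs in rows],
)

# Fetch a chunk of rows, then filter by attribute in the application.
chunk = conn.execute("SELECT id, attributes FROM items ORDER BY id").fetchall()
red_item_ids = [
    item_id for item_id, text in chunk if json.loads(text).get("Colour") == "Red"
]
```

This stays simple only as long as the chunk being filtered is small enough to pull into the application.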

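You can create a single "attributes" table (id, item_id, key, value, data_type, sort_value), where the sort column converts the actual value into a string-sortable representation (e.g. date: "2010-12-25 12:00:00", number: "0000000001"). Or you can create separate attribute tables by data type (e.g. string_attributes, date_attributes, number_attributes). Among the numerous pros and cons of both approaches: the first is simpler, the second is faster. Both will make you write ugly, complex queries.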
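
To show why the sort_value column matters in the single-table variant, here is a rough sketch with Python's built-in sqlite3. The to_sort_value helper and the ten-digit padding width are assumptions, and the padding only handles non-negative integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE attributes (
        id         INTEGER PRIMARY KEY,
        item_id    INTEGER NOT NULL,
        key        TEXT NOT NULL,
        value      TEXT NOT NULL,
        data_type  TEXT NOT NULL,
        sort_value TEXT NOT NULL
    )
""")

def to_sort_value(value, data_type):
    """Convert a value to a string whose text order matches its natural order."""
    if data_type == "number":
        return f"{int(value):010d}"  # zero-pad non-negative integers to ten digits
    return str(value)  # ISO-style dates and plain strings already sort as text

prices = [(1, "9"), (2, "100"), (3, "25")]
for item_id, price in prices:
    conn.execute(
        "INSERT INTO attributes (item_id, key, value, data_type, sort_value) "
        "VALUES (?, 'price', ?, 'number', ?)",
        (item_id, price, to_sort_value(price, "number")),
    )

# Sorting on the raw text compares character by character: "100" < "25" < "9".
by_value = [r[0] for r in conn.execute(
    "SELECT item_id FROM attributes WHERE key = 'price' ORDER BY value")]
# Sorting on the padded text gives the numeric order: 9, 25, 100.
by_sort_value = [r[0] for r in conn.execute(
    "SELECT item_id FROM attributes WHERE key = 'price' ORDER BY sort_value")]
```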


From experience, I have found that certain keys will be more widely used or queried more often. We have then usually de-normalized the design slightly to include a specific field back in the main "item" table.

E.g. if every Item has a Colour, you might add the Colour column to your item table. Fabric and Size may be used less often and can be kept separate in the key-value pair table. You may even keep the colour in the key-value pair table, but duplicate the data in the item table to get the performance benefits.

Obviously this varies depending on the data and how flexible you need the key-value pairs to be. It can also result in your attribute data not being located consistently. However, de-normalizing does greatly simplify the queries and improves their performance as well.

I would usually only consider de-normalizing when performance becomes an issue, not just to simplify a query.
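
A sketch of that layout, again with Python's built-in sqlite3. The items table and the sample rows are hypothetical. Colour is promoted to a column, while Size and Fabric stay in the key-value pair table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (
        itemid TEXT PRIMARY KEY,
        colour TEXT              -- promoted from the key-value table
    );
    CREATE TABLE key_value_pairs (
        itemid    TEXT NOT NULL,
        itemkey   TEXT NOT NULL,
        itemvalue TEXT NOT NULL,
        PRIMARY KEY (itemid, itemkey)
    );
    INSERT INTO items VALUES ('123', 'Red'), ('124', 'Blue');
    INSERT INTO key_value_pairs VALUES
        ('123', 'Size', 'Medium'), ('123', 'Fabric', 'Cotton'), ('124', 'Size', 'Large');
""")

# The common filter touches only the item table.
red_items = [r[0] for r in conn.execute("SELECT itemid FROM items WHERE colour = 'Red'")]

# Less common attributes still need a join against the key-value table.
red_cotton_items = [r[0] for r in conn.execute("""
    SELECT i.itemid
    FROM items AS i
    JOIN key_value_pairs AS kv ON kv.itemid = i.itemid
    WHERE i.colour = 'Red' AND kv.itemkey = 'Fabric' AND kv.itemvalue = 'Cotton'
""")]
```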


PostgreSQL 8.4 supports the hstore data type for storing sets of (key, value) pairs within a single PostgreSQL data field. Please refer to http://www.postgresql.org/docs/8.4/static/hstore.html for usage information. This is a very old question, but I thought I would pass this on in case it helps someone.


I don't understand why the SQL to extract data should be complex for your first design. Surely to get all values for an item, you just do this:

SELECT itemkey,itemvalue FROM key_value_pairs WHERE itemid='123';

or if you just want one particular key for that item:

SELECT itemvalue FROM key_value_pairs WHERE itemid='123' AND itemkey='Fabric';

The first design also gives you the flexibility to easily add new keys whenever you like.
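
Lookups for a single item are indeed short. For comparison, the sketch below (Python's built-in sqlite3, with a hypothetical second item 124 added to the sample data) shows two query shapes that tend to grow with the number of keys: pivoting keys into columns, and filtering on two attributes at once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE key_value_pairs (
        itemid    varchar(32) NOT NULL,
        itemkey   varchar(32) NOT NULL,
        itemvalue varchar(32) NOT NULL,
        CONSTRAINT ct_primarykey PRIMARY KEY (itemid, itemkey)
    );
    INSERT INTO key_value_pairs VALUES
        ('123', 'Colour', 'Red'), ('123', 'Size', 'Medium'), ('123', 'Fabric', 'Cotton'),
        ('124', 'Colour', 'Blue'), ('124', 'Size', 'Medium');
""")

# One row per item, one column per key (conditional aggregation).
pivoted = conn.execute("""
    SELECT itemid,
           MAX(CASE WHEN itemkey = 'Colour' THEN itemvalue END) AS colour,
           MAX(CASE WHEN itemkey = 'Size'   THEN itemvalue END) AS size,
           MAX(CASE WHEN itemkey = 'Fabric' THEN itemvalue END) AS fabric
    FROM key_value_pairs
    GROUP BY itemid
    ORDER BY itemid
""").fetchall()

# Matching two attributes at once needs one self-join per extra key.
red_medium = [r[0] for r in conn.execute("""
    SELECT c.itemid
    FROM key_value_pairs AS c
    JOIN key_value_pairs AS s ON s.itemid = c.itemid
    WHERE c.itemkey = 'Colour' AND c.itemvalue = 'Red'
      AND s.itemkey = 'Size'   AND s.itemvalue = 'Medium'
""")]
```

Each additional key adds another CASE expression or another self-join, which may be the complexity the question has in mind.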


I think the best way to design such tables is as follows:

  • Make the frequently used fields as columns in the database.
  • Provide a Misc column which contains a dictionary (in JSON/XML/other string format) that holds the remaining fields as key-value pairs.

Salient points:

  • You can write normal SQL queries against the regular columns in most situations.
  • You can do a full-text search on the key-value pairs. MySQL has a full-text search engine; otherwise you can use LIKE queries, which are a little slower. Although full-text search is not ideal, we assume that such queries are rare, so that should not cause too many issues.
  • If your key-value pairs are simple boolean flags, this technique has the same power as having a separate column for the key. Any more complex operation on the key-value pairs should be done outside the database.
  • Looking at the frequency of queries over a period of time will tell you which key-value pairs need to be converted into columns.
  • This technique also makes it easy to enforce integrity constraints on the database.
  • It provides a more natural path for developers to re-factor their schema and code.
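
A minimal sketch of this layout, assuming JSON for the Misc column and Python's built-in sqlite3 and json modules. The items table, the washable flag, and the add_item helper are hypothetical. The LIKE match only works because the JSON text is written in one fixed, compact form.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE items (
        id     INTEGER PRIMARY KEY,
        colour TEXT NOT NULL,   -- frequently used field kept as a real column
        misc   TEXT NOT NULL    -- everything else, serialized as JSON
    )
""")

def add_item(connection, item_id, colour, misc):
    # sort_keys and fixed separators keep the stored text predictable for LIKE
    text = json.dumps(misc, sort_keys=True, separators=(",", ":"))
    connection.execute("INSERT INTO items VALUES (?, ?, ?)", (item_id, colour, text))

add_item(conn, 1, "Red", {"washable": True, "fabric": "Cotton"})
add_item(conn, 2, "Red", {"washable": False})
add_item(conn, 3, "Blue", {"washable": True})

# Normal SQL on the column, plus a LIKE match on a boolean flag in misc.
red_washable = [r[0] for r in conn.execute(
    "SELECT id FROM items WHERE colour = 'Red' AND misc LIKE ?",
    ('%"washable":true%',),
)]
```

If the washable flag turned out to be queried often, it would be a candidate for promotion to its own column.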

The first method is quite OK. You can create a UDF that extracts the desired data and just call that.


If you have very few possible keys, then I would just store them as columns. But if the set of possible keys is large then your first approach is good (and the second approach would be impossible).

Or is it so that each item can only have a finite number of keys, but the keys could be something from a large set?

You could also consider using an Object Relational Mapper to make querying easier.


The first method is a lot more flexible at the cost you mention.

And the second approach is never viable as you showed. Instead you'd do (as per your first example)

create table item_config (item_id int, colour varchar, size varchar, fabric varchar)

of course this will only work when the amount of data is known and doesn't change a lot.

As a general rule, any application that demands changing the DDL of tables to do normal work should be given a second and third thought.


Violating normalization rules is fine as long as the business requirement can still be fulfilled. Having key_1, value_1, key_2, value_2, ... key_n, value_n can be OK, right up until the point that you need key_n+1, value_n+1.

My solution has been a table of data for shared attributes and XML for unique attributes. That means I use both. If everything (or most things) has a size, then size is a column in the table. If only object A has attribute Z, then Z is stored as XML, similar to Peter Marshall's answer already given.


The second table is badly de-normalised. I would stick with the first approach.


I think you're doing the right thing, as long as the keys/values for a given type of item change frequently.
If they are rather static, then simply making the item table wider makes more sense.

We use a similar (but rather more complex) approach, with a lot of logic around the keys/values, as well as tables for the types of values permitted for each key.
This allows us to define items as just another instance of a key, and our central table maps arbitrary key types to other arbitrary key types. It can rapidly tie your brain in knots, but once you've written and encapsulated the logic to handle it all, you have a lot of flexibility.

I can write more details of what we do if required.


If the keys are dynamic, or there are loads of them, then use the mapping table that you have as your first example. In addition, this is the most general solution: it scales best in the future as you add more keys, it is easy to code the SQL to get the data out, and the database will be able to optimise the query better than you would imagine (i.e., I wouldn't put effort into prematurely optimising this case unless it was proven to be a bottleneck in testing later on, in which case you could consider the next two options below).

If the keys are a known set, and there aren't many of them (<10, maybe <5), then I don't see the problem in having them as value columns on the item.

If there are a medium number of known fixed keys (10 - 30) then maybe have another table to hold the item_details.

However I don't ever see a need to use your second example structure, it looks cumbersome.


If you go the route of a KVP table, and I have to say that I do not like that technique at all myself as it is indeed difficult to query, then you should consider clustering the values for a single item id together using an appropriate technique for whatever platform you're on.

RDBMSs have a tendency to scatter rows around to avoid block contention on inserts, and if you have 8 rows to retrieve you could easily find yourself accessing 8 blocks of the table to read them. On Oracle you'd do well to consider a hash cluster for storing these, which would vastly improve performance on accessing the values for a given item id.


Your example is not a very good example of the use of key-value pairs. A better example would be something like a Fee table, a Customer table, and a Customer_Fee table in a billing application. The Fee table would consist of fields like fee_id, fee_name, fee_description. The Customer_Fee table would consist of fields like customer_id, fee_id, fee_value.
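
A rough sketch of those three tables using Python's built-in sqlite3. The customer_name column, the column types, and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE fee (
        fee_id          INTEGER PRIMARY KEY,
        fee_name        TEXT NOT NULL,
        fee_description TEXT
    );
    CREATE TABLE customer (
        customer_id   INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL      -- hypothetical column
    );
    CREATE TABLE customer_fee (
        customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
        fee_id      INTEGER NOT NULL REFERENCES fee (fee_id),
        fee_value   NUMERIC NOT NULL,
        PRIMARY KEY (customer_id, fee_id)
    );
    INSERT INTO fee VALUES (1, 'Setup', 'One-time setup fee'), (2, 'Monthly', 'Recurring fee');
    INSERT INTO customer VALUES (7, 'Acme');
    INSERT INTO customer_fee VALUES (7, 1, 50), (7, 2, 20);
""")

# All fees for one customer, with names resolved through the fee table.
fees_for_customer = conn.execute("""
    SELECT f.fee_name, cf.fee_value
    FROM customer_fee AS cf
    JOIN fee AS f ON f.fee_id = cf.fee_id
    WHERE cf.customer_id = 7
    ORDER BY f.fee_id
""").fetchall()
total = sum(value for _, value in fees_for_customer)
```

Here the "key" is a foreign key into a real table and every value has the same meaning and type, which is what separates this from a generic key-value table.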


Times have changed. You now have other database types to choose from besides relational databases. NoSQL choices now include column stores, document stores, graph databases, and multi-model databases (see http://en.wikipedia.org/wiki/NoSQL).

For key-value databases, your choices include (but are not limited to) CouchDB, Redis, and MongoDB.

Reference URL: https://stackoverflow.com/questions/126271/key-value-pairs-in-relational-database
