PostgreSQL Oracle/sql server/SYBASE 兼容性之 - nchar , nvarchar , ntext (unicode character)

在Oracle, MS SQL, SYBASE数据库引擎中，有一些这样的字符串类型nchar , nvarchar , ntext。

这些类型是什么含义？在PostgreSQL中应该使用什么类型与之对应？

以MS SQL为例

https://docs.microsoft.com/en-us/sql/t-sql/data-types/nchar-and-nvarchar-transact-sql?view=sql-server-2016

Character data types that are either fixed-length, nchar, or variable-length, nvarchar, Unicode data and use the UNICODE UCS-2 character set.

nchar [ ( n ) ]

Fixed-length Unicode string data. n defines the string length and must be a value from 1 through 4,000.

实际上，就两层含义

1、存储的是unicode编码的字符串，使用UNICODE UCS-2字符集。

2、长度指的是字符个数，而非字节数。

nchar , nvarchar , ntext 在 PostgreSQL的对应关系

介于以上介绍的两点，只要满足以下条件，char,varchar,text即对应到了nchar , nvarchar , ntext类型。

1、在PostgreSQL中使用UTF8字符集时，实际上就是unicode（别名）。

https://www.postgresql.org/docs/devel/static/multibyte.html

满足以上条件即可用char,varchar,text直接替代nchar,nvarchar,ntext，因为在PostgreSQL中char(n),varchar(n)，任何时候就是指的字符长度（而不是字节长度）。

如果PG 目标DB不是utf8字符集，又当如何

可以有两种方法：

1、实际上依旧可以使用char, varchar, text类型存储(长度限制与上游保持一致即可)，只是建议业务方做一下字符集转换后存入PG。

源端存储的 unicode 字符串转换为 PG目标库的目标字符集字符串。

postgres=# \df convert*  
                             List of functions  
   Schema   |     Name     | Result data type | Argument data types | Type   
------------+--------------+------------------+---------------------+------  
 pg_catalog | convert      | bytea            | bytea, name, name   | func  
 pg_catalog | convert_from | text             | bytea, name         | func  
 pg_catalog | convert_to   | bytea            | text, name          | func  
(3 rows)  
dbtest1=# select convert_to(N'你好中国','sqlascii');  
     convert_to       
--------------------  
 \xc4e3bac3d6d0b9fa  
(1 row)  
dbtest1=# select convert_to(N'你好中国','utf8');  
         convert_to           
----------------------------  
 \xe4bda0e5a5bde4b8ade59bbd  
(1 row)  
2、或者你在PG中可以使用字节流存储 来自源库unicode 字符串的字节流，读取时再转换为当前字符集。
dbtest1=# \l  
                                 List of databases  
   Name    |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges     
-----------+----------+----------+------------+------------+-----------------------  
 dbtest1   | postgres | EUC_CN   | C          | C          |   
create table test (id int, content bytea);  
insert into test values (1, convert_to(N'你好中国','utf8'));  
dbtest1=# select * from test;  
 id |          content             
----+----------------------------  
  1 | \xe4bda0e5a5bde4b8ade59bbd  
(1 row)  
dbtest1=# select convert_from(content,'utf8') from test;  
 convert_from   
--------------  
(1 row)  
PostgreSQL nchar , nvarchar , ntext 的输入语法
N quote
dbtest1=# select N'abc你好中国';  
   bpchar      
-------------  
 abc你好中国  
(1 row)  
1、src/backend/parser/scan.l
 * To ensure that {quotecontinue} can be scanned without having to back up  
 * if the full pattern isn't matched, we include trailing whitespace in  
 * {quotestop}.  This matches all cases where {quotecontinue} fails to match,  
 * except for {quote} followed by whitespace and just one "-" (not two,  
 * which would start a {comment}).  To cover that we have {quotefail}.  
 * The actions for {quotestop} and {quotefail} must throw back characters  
 * beyond the quote proper.  
quote                   '  
quotestop               {quote}{whitespace}*  
quotecontinue   {quote}{whitespace_with_newline}{quote}  
quotefail               {quote}{whitespace}*"-"  
/* Bit string  
 * It is tempting to scan the string for only those characters  
 * which are allowed. However, this leads to silently swallowed  
 * characters if illegal characters are included in the string.  
 * For example, if xbinside is [01] then B'ABCD' is interpreted  
 * as a zero-length string, and the ABCD' is lost!  
 * Better to pass the string forward and let the input routines  
 * validate the contents.  
xbstart                 [bB]{quote}  
xbinside                [^']*  
/* Hexadecimal number */  
xhstart                 [xX]{quote}  
xhinside                [^']*  
输入nchar,nvarchar方法:
/* National character */  
xnstart                 [nN]{quote}  
/* Quoted string that allows backslash escapes */  
xestart                 [eE]{quote}  
xeinside                [^\\']+  
xeescape                [\\][^0-7]  
xeoctesc                [\\][0-7]{1,3}  
xehexesc                [\\]x[0-9A-Fa-f]{1,2}  
xeunicode               [\\](u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})  
xeunicodefail   [\\](u[0-9A-Fa-f]{0,3}|U[0-9A-Fa-f]{0,7})  
/* Extended quote  
 * xqdouble implements embedded quote, ''''  
xqstart                 {quote}  
xqdouble                {quote}{quote}  
xqinside                [^']+  
/* $foo$ style quotes ("dollar quoting")  
 * The quoted string starts with $foo$ where "foo" is an optional string  
 * in the form of an identifier, except that it may not contain "$",  
 * and extends to the first occurrence of an identical string.  
 * There is *no* processing of the quoted text.  
 * {dolqfailed} is an error rule to avoid scanner backup when {dolqdelim}  
 * fails to match its trailing "$".  
dolq_start              [A-Za-z\200-\377_]  
dolq_cont               [A-Za-z\200-\377_0-9]  
dolqdelim               \$({dolq_start}{dolq_cont}*)?\$  
dolqfailed              \${dolq_start}{dolq_cont}*  
dolqinside              [^$]+  
/* Double quote  
 * Allows embedded spaces and other special characters into identifiers.  
dquote                  \"  
xdstart                 {dquote}  
xdstop                  {dquote}  
xddouble                {dquote}{dquote}  
xdinside                [^"]+  
/* Unicode escapes */  
uescape                 [uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']{quote}  
/* error rule to avoid backup */  
uescapefail             [uU][eE][sS][cC][aA][pP][eE]{whitespace}*"-"|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*|[uU][eE][sS][cC][aA][pP]|[uU][eE][sS][cC][aA]|[uU][eE][sS][cC]|[uU][eE][sS]|[uU][eE]|[uU]  
/* Quoted identifier with Unicode escapes */  
xuistart                [uU]&{dquote}  
/* Quoted string with Unicode escapes */  
xusstart                [uU]&{quote}  
/* Optional UESCAPE after a quoted string or identifier with Unicode escapes. */  
xustop1         {uescapefail}?  
xustop2         {uescape}  
/* error rule to avoid backup */  
xufailed                [uU]&  
2、https://docs.microsoft.com/en-us/sql/t-sql/data-types/nchar-and-nvarchar-transact-sql?view=sql-server-2016
3、https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-types-transact-sql?view=sql-server-2016
4、https://www.postgresql.org/docs/devel/static/multibyte.html
5、https://stackoverflow.com/questions/1245217/what-is-the-postgresql-equivalent-to-sql-server-nvarchar
It’s varchar and text, assuming your database is in UNICODE encoding.
If your database is in a non-UNICODE encoding, there is no special datatype that will give you a unicode string - you can store it as a bytea stream, but that will not be a string.
6、各种数据库引擎（postgresql, mysql, oracle, ms sql, sql lite等）的WEB版SQL在线测试运行环境
http://www.sqlfiddle.com/
7、《PostgreSQL 转义、UNICODE、与SQL注入》
digoal’s 大量PostgreSQL文章入口
云数据库PPAS版，是阿里云与EnterpriseDB公司(简称EDB)合作基于PostgreSQL高度兼容Oracle语法的数据库服务，为用户提供易于操作的迁移工具，兼容范围涵盖：PL/SQL、数据类型、高级函数、表分区等。
用户可以直接在阿里云购买PPAS进行使用。
如果在购买PPAS前，想试用一下...
IoT场景，车联网场景，共享单车场景，人的行为位点等，终端实时上报的是孤立的位点，我们需要将其补齐成轨迹。
例如共享单车，下单，开锁，生成订单，骑行，关闭订单，关锁。这个过程有一个唯一的订单号，每次上报的位点会包含时间，订单号，位置。
根据订单号，将点聚合为轨迹。
使用pipelinedb插件，可以实...

使用datediff，对时间或日期相减，得到的间隔，转换为目标单位(日、月、季度、年、小时、秒。。。等)的数值。
DATEDIFF ( datepart, {date|timestamp}, {date|timestamp} )  
select datediff(week,'2009-01-0...