Java Collection(一):Array-based Data Structure

Brief

底层数据结构基于数组的集合,可以通过“下标”快速定位到指定元素;数组的增、删元素时需要进行数据移动拷贝;数据扩容时,需要进行Arrays.copyOf()

Basic Opertions

Indexing:

  • An obvious choice for implementing the list ADT is to use an array, A, where A[i] stores (a reference to) the element with index i.
  • With a representation based on an array A, the get(i) and set(i, e) methods are easy to implement by accessing A[i] (assuming i is a legitimate index).

Indexing

Insertion:

  • In an operation add(i, o), we need to make room for the new element by shifting forward the n - i elements A[i], …, A[n - 1]
  • In the worst case (i = 0), this takes O(n) time

In an add operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one.

Indexing

Removal:

  • In an operation remove(i), we need to fill the hole left by the removed element by shifting backward the n - i - 1 elements A[i + 1], …, A[n - 1]
  • In the worst case (i = 0), this takes O(n) time

Categories

  • Fixed-Array : Array
  • Resizable-Array: ArrayList, Vector(thread-safe)
  • Immutable-Array: ImmutableList(Google Guaua)
Collection thread-safe grow rule init size
ArrayList not 50% of its size 10
Vector yes doubles its array size 10
ImmutableList yes no

通过Collections.synchronizedCollection()可以将ArrayList转换为线程安全的List。

Performance

In an array-based implementation of a dynamic list:

  • The space used by the data structure is O(n)
  • Indexing the element at i takes O(1) time
  • add and remove run in O(n) time in the worst case

Appendix

  • System.arraycopy() VS Arrays.copyOf(): The difference is that Arrays.copyOf does not only copy elements, it also creates a new array. System.arrayCopy copies into an existing array.
  • 为什么concurrent包下没有ArrayList的实现?

MySQL索引(一):正确、有效地使用索引

索引

  1. 什么是索引?
  2. 索引分类
  3. 创建原则
  4. 使用注意
  5. 案例

什么是索引?

MySQL官方定义:“Index (索引)”是帮助MySQL高效获取数据的数据结构。“Index (索引)”本质是一种数据结构,存储在磁盘上。MySQL普遍使用B+Tree作为索引结构。

Linear Searh: tcp_keepalive

B+Tree Search: tcp_keepalive

索引分类

按照索引列的值必须唯一:

  • 普通索引 :允许索引列中插入重复值和空值
  • 唯一索引 :索引列的值必须唯一,但允许为空值。主键索引是一种特殊的唯一索引,不允许有空值。

按照索引列组成的个数:

  • 单列索引:索引只包含单个列,一个表可以有多个单列索引。
  • 组合索引:多个字段组合,创建的索引。遵循“最左前缀原则

其他索引:

  • 全文索引(FULLTEXT):在定义索引列上支持值得全文查找。

创建原则

  1. 2000条以上建议建索引
  2. 组合索引:最左前缀原则,“选择性”高的列排在前面。
  3. “选择性”低的列不建议建立索引。Index Selectivity = Cardinality / #T
  4. 尽量地扩展原有索引,不建议新建索引
  5. 字符串尽量使用短索引

使用注意

  1. 隐式类型转换,无法命中索引,扫描全表(建议按照字段类型传值)。
  2. 参与计算(函数和表达式)的列,不走索引。
  3. “负向查询”不走索引。
  4. 前导模糊查询不能走索引。如: like ‘%xxx’
Read More...

Key In Http Protocol (二):'Connection'

Brief

HTTP works with request-response: client connects to server, performs a request and gets a response. Normally, the connection to an HTTP server is closed after each response. With HTTP keep-alive you keep the underlying TCP connection open until “certain criteria” are met.

“Connection:keep-alive” is a method to allow the same tcp connection for HTTP conversation instead of opening a new one with each new request.

More simply put, it is a communication between the web server and the web browser that says “you can grab more than just one file at a time“. Interaction diagram as follows:

tcp_keepalive

That enable HTTP ‘keep-alive’ is callded ‘HTTP persistent connection‘. See HTTP persistent connection for some advantages , such as:

  • Lower CPU and memory usage : by opening and closing fewer TCP connections, CPU time is saved in routers and hosts (clients, servers, proxies, gateways, tunnels, or caches), and memory used for TCB (TCP control blocks) can be saved in hosts.
  • Reduced latency : latency on subsequent requests is reduced since there is no time spent in TCP’s connection opening handshake.
  • HTTP pipelining : HTTP requests and responses can be pipelined on a connection. Pipelining allows a client to make multiple requests without waiting for each response, allowing a single TCP connection to be used much more efficiently, with much lower elapsed time.
Read More...