Oracle MySQL NoSQL DBA|数据管理,架构,监控与性能优化 --- NinGoo.net
NinGoo's blog
  • 初略的翻了一下《1984》,直觉的浑身冷汗。 1 day ago
  • 其实更愿意坚持,有时候也是不得以。 @mujiang 1 day ago
  • 淘宝第一台小型机上的核心库的即将下线,完成了其五年的历史使命。接下来会有越来越多小型机存储的下线,pc化,水平化,可扩展高可用的趋势,不可阻挡。 2 days ago
  • 卓越的书,昨天就已经收到了。但是今天才收到卓越的邮件,说库房已经安排发货,预计明天可以送达,请您耐心等待、保持电话畅通并注意查收。这是什么系统啊,囧。 2 days ago
  • 昨天在卓越下单买的几本书上午就送到了,不到24小时,效率不错。 3 days ago

厦门outing

8.14~8.16,技术保障部厦门outing。算起来这是到到淘宝以后的第三次大部门outing,一次临安,一次安吉,这次总算是跨出了浙江省的范围,来到美丽的海边城市厦门。来回都是动车,六个小时的车程,100多人的队伍,打牌聊天,到也不觉旅途的漫长。

淘宝DBA团队有了更多的新成员,难得的在鼓浪屿留下一张合影,可惜的是还有几位没能聚齐。
淘宝DBA厦门outing

虽然鼓浪屿上的建筑各具特色,夏门大学的校园风景怡人,南普陀的放生池乌龟成群,不过厦门的大海有点让人失望。
厦门海边小男孩

当然,海水一般,海鲜还是不一般的,15号晚上去了传说中的小眼镜,生意好得出奇,在外面露天排号差不多一个小时才轮到,好在味道还是对得起传说中的大名的。路上还有个小插曲,因为人多,需要打两辆车,另外几个同事打的一辆出租车,一说去小眼镜,师傅说小眼镜没空位,就被赶下车了,囧。厦门这边的服务水平有待提高,一不小心就可能碰个钉子。

Cassandra 0.7 值得期待

在Cassandra的wiki上,很早就有0.7的一些特性描述,其中很有些吸引人,而8月13号Cassandra 0.7 beta1版本终于发布了,这里可以下载。

个人比较关心的几个主要的新特性:

1. Key Space和Column Family定义可以在线增改,不再需要停集群修改配置文件了。
2. 支持secondary index,可以对column建索引,通过接口get_indexed_slices实现针对column的查询。
3. 支持truncate一个column family。
4. 可以针对keyspace设置replica_placement_strategy和replication_factor。
5. Row cache提升了8倍的读性能。之前版本的测试中,Cassandra写性能令人印象深刻,读性能则不如人意。
6. 支持hadoop格式的输出,可以使得数据仓库更容易从Cassandra中抽取数据。

另外,配置文件从xml改成了yaml格式的,读起来更顺畅些。代码里还有很多细节的改进,有时间需要慢慢去看了。期待0.7版本尽快GA。

0.7.0
=====

Features
--------
    - Row keys are now bytes: keys stored by versions prior to 0.7.0 will be
      returned as UTF-8 encoded bytes. OrderPreservingPartitioner and
      CollatingOrderPreservingPartitioner continue to expect that keys contain
      UTF-8 encoded strings, but RandomPartitioner no longer expects strings.
    - A new ByteOrderedPartitioner supports bytes keys with arbitrary content,
      and orders keys by their byte value.
    - Truncate thrift method allows clearing an entire ColumnFamily at once
    - DatacenterShardStrategy is ready for use, enabling
      ConsitencyLevel.DCQUORUM and DCQUORUMSYNC.  See comments in
      `cassandra.yaml.`
    - row size limit increased from 2GB to 2 billion columns
    - Hadoop OutputFormat support
    - Streaming data for repair or node movement no longer requires
      anticompaction step first
    - keyspace is per-connection in the thrift API instead of per-call
    - optional round-robin scheduling between keyspaces for multitenant
      clusters
    - dynamic endpoint snitch mitigates the impact of impaired nodes
    - significantly faster reads from row cache
    - introduced IntegerType that is both faster than LongType and
      allows integers of both less and more bits than Long's 64

Configuraton
------------
    - Configuration file renamed to cassandra.yaml and log4j.properties to
      log4j-server.properties
    - Added 'bin/config-converter' to convert existing storage-conf.xml or
      cassandra.xml files to a cassandra.yaml file. When executed, it will
      create a cassandra.yaml file in any directory containing a matching
      xml file.
    - The ThriftAddress and ThriftPort directives have been renamed to
      RPCAddress and RPCPort respectively.
    - The keyspaces defined in cassandra.yaml are ignored on startup as a
      result of CASSANDRA-44.  A JMX method has been exposed in the
      StorageServiceMBean to force a schema load from cassandra.yaml. It
      is a one-shot affair though and you should conduct it on a seed node
      before other nodes. Subsequent restarts will load the schema from the
      system table and attempts to load the schema from YAML will be ignored.
      You shoud only have to do this for one node since new nodes will receive
      schema updates on startup from the seed node you updated manually.
    - EndPointSnitch was renamed to RackInferringSnitch.  A new SimpleSnitch
      has been added.
    - RowWarningThresholdInMB replaced with in_memory_compaction_limit_in_mb
    - GCGraceSeconds is now per-ColumnFamily instead of global
    - Configuration of DatacenterShardStrategy is now a part of the keyspace
      definition using the strategy_options attribute.
      The datacenter.properties file is no longer used.

JMX
---
    - StreamingService moved from o.a.c.streaming to o.a.c.service
    - GMFD renamed to GOSSIP_STAGE
    - {Min,Mean,Max}RowCompactedSize renamed to {Min,Mean,Max}RowSize
      since it no longer has to wait til compaction to be computed

Thrift API
----------
    - Row keys are now 'bytes': see the Features list.
    - The return type for login() is now AccessLevel.
    - The get_string_property() method has been removed.
    - The get_string_list_property() method has been removed.

Other
-----
    - If extending AbstractType, make sure you follow the singleton pattern
      followed by Cassandra core AbstractType extensions.
      e.g. BytesType has a variable called 'instance' and an empty constructor
      with default access

0.7.0-beta1
 * sstable versioning (CASSANDRA-389)
 * switched to slf4j logging (CASSANDRA-625)
 * access levels for authentication/authorization (CASSANDRA-900)
 * add ReadRepairChance to CF definition (CASSANDRA-930)
 * fix heisenbug in system tests, especially common on OS X (CASSANDRA-944)
 * convert to byte[] keys internally and all public APIs (CASSANDRA-767)
 * ability to alter schema definitions on a live cluster (CASSANDRA-44)
 * renamed configuration file to cassandra.xml, and log4j.properties to
   log4j-server.properties, which must now be loaded from
   the classpath (which is how our scripts in bin/ have always done it)
   (CASSANDRA-971)
 * change get_count to require a SlicePredicate. create multi_get_count
   (CASSANDRA-744)
 * re-organized endpointsnitch implementations and added SimpleSnitch
   (CASSANDRA-994)
 * Added preload_row_cache option (CASSANDRA-946)
 * add CRC to commitlog header (CASSANDRA-999)
 * removed multiget thrift method (CASSANDRA-739)
 * removed deprecated batch_insert and get_range_slice methods (CASSANDRA-1065)
 * add truncate thrift method (CASSANDRA-531)
 * http mini-interface using mx4j (CASSANDRA-1068)
 * optimize away copy of sliced row on memtable read path (CASSANDRA-1046)
 * replace constant-size 2GB mmaped segments and special casing for index
   entries spanning segment boundaries, with SegmentedFile that computes
   segments that always contain entire entries/rows (CASSANDRA-1117)
 * avoid reading large rows into memory during compaction (CASSANDRA-16)
 * added hadoop OutputFormat (CASSANDRA-1101)
 * efficient Streaming (no more anticompaction) (CASSANDRA-579)
 * split commitlog header into separate file and add size checksum to
   mutations (CASSANDRA-1179)
 * avoid allocating a new byte[] for each mutation on replay (CASSANDRA-1219)
 * revise HH schema to be per-endpoint (CASSANDRA-1142)
 * add joining/leaving status to nodetool ring (CASSANDRA-1115)
 * allow multiple repair sessions per node (CASSANDRA-1190)
 * add dynamic endpoint snitch (CASSANDRA-981)
 * optimize away MessagingService for local range queries (CASSANDRA-1261)
 * make framed transport the default so malformed requests can't OOM the
   server (CASSANDRA-475)
 * significantly faster reads from row cache (CASSANDRA-1267)
 * take advantage of row cache during range queries (CASSANDRA-1302)
 * make GCGraceSeconds a per-ColumnFamily value (CASSANDRA-1276)
 * keep persistent row size and column count statistics (CASSANDRA-1155)
 * add IntegerType (CASSANDRA-1282)
 * page within a single row during hinted handoff (CASSANDRA-1327)
 * push DatacenterShardStrategy configuration into keyspace definition,
   eliminating datacenter.properties. (CASSANDRA-1066)
 * optimize forward slices starting with '' and single-index-block name
   queries by skipping the column index (CASSANDRA-1338)
 * streaming refactor (CASSANDRA-1189)

btrfs文件系统

最近一位同事在美国参加LSF(Linux Storage and Filesystem Summit)回来,分享了他的总结,非常的精彩。这是一个没有演讲、没有照片、没有录像的讨论为主的会议,因此参会者基本上都是比较资深的Linux内核和文件系统的开发者,参加这个讨论会的有不少中国人(华人),相比起我们这些只是理解使用某个数据库产品的DBA来说,这是值得我们敬仰的一群人。也许在不远的将来,淘宝的DBA团队也不仅仅是一个维护某种数据库产品的团队,也能在开源数据库领域内贡献自己的力量,这也正是我们接下来招聘的思路所在。淘宝DBA团队,也已经建立了开源数据库研究小组,当前主要是MySQL/InnoDB,如果你是一位优秀的c/c++开发者,如果你对MySQL/InnoDB等开源数据库在大规模大并发量的网站应用有兴趣,如果你想有一天,能够为开源数据库提供Patch,欢迎加入我们,不管之前你有没有DBA经验,也不管你还是个应届毕业生(我的邮箱jiangfeng # taobao.com,你懂的)。

扯远了,这里我想说的是,从这个报告我注意到了btrfs文件系统。前一段时间,也一直在考虑,在Linux上部署MySQL数据库,究竟什么样的文件系统会比较适合?或者说什么文件系统适合不同的数据库环境?之前的思路,更多的集中在xfs, ext4等已经比较熟悉的文件系统上,上周去北京出差的时候,本来还想找前面的这位同事探讨一下ext4是否适合在数据库产品环境中部署,而当时他正好在美国参加LSF,因此也错过了机会。不过从他的这个总结来看,ext4已经达到了产品级别的稳定性,应该可以尝试,接下来可能会安排一些测试,这是后话。

对于btrfs,简单的从字面来理解,就是以B-Tree作为组织元数据的数据结构,相比于ext2/ext3使用线性表来保存目录等结构,好处在于查找,插入和删除操作都很高效。btrfs另外一个比较重要的特性是基于extent来分配管理文件空间,而不是传统的block。看到这里,作为DBA,其实可以很明显的看到,这和Oracle数据库组织管理数据的思路何其一致(如果寻根问祖,发现这个btrfs根本就是由Oracle最初在2007年开放出来的项目,本是同根生啊)。关于btrfs更详细的信息,可以参考这里,还有这里。另外,btrfs对于目前热点的SSD磁盘也提供了特别的支持,因此被认为是Linux未来文件系统的趋势所在,虽然目前还是开发版本,但值得关注。

实际上可以做一个简单的类比,就能对btrfs有一个更加感性的认识,可以将btrfs看作是Linux版本的ZFS,其设计思路和特性,很多都是借鉴自ZFS,其主要特性如下:

  • Copy on Write
  • Extent based file storage (2^64 max file size)
  • Space efficient packing of small files
  • Space efficient indexed directories
  • Dynamic inode allocation
  • Writable snapshots
  • Subvolumes (separate internal filesystem roots)
  • Object level mirroring and striping
  • Checksums on data and metadata (multiple algorithms available)
  • Compression
  • Integrated multiple device support, with several raid algorithms
  • Online filesystem check
  • Very fast offline filesystem check
  • Efficient incremental backup and FS mirroring
  • Online filesystem defragmentation

Cassandra运维之道 v0.2

最近几个尝试性的Cassandra应用中碰到了一些问题,在查找问题的过程中发现之前有些理解不到位,或者有偏差遗漏的地方,在v0.1的基础上,修改补充了小部分内容。从实际应用来看,Cassandra节点的稳定性还有很多工作要做,而实际系统的运维也还有很多的细节需要逐步规范下来。此PPT中有错漏或者待补充完善的地方,也欢迎大家指正。


常用标签: oracle life MySQL Oracle11g blog dba 新特性 oow2009 oow linux

最新评论 | Recent comments

  • guanghui: 我很早就开始关注zfs,然后自然也关...
  • thomas zhang: 这个bug是够恶心的 官方11.2.0.2 Release...
  • 巫师勋: 这里面有一些理念和ASM很像啊,确实...
  • 大头刚: EXT4 对内核的要求太高,大规模的迁...
  • ruochen: ext4的稳定性现在应该说已经很不错了...
  • Rain@DNA: 是的,目前的一些主流版本里已经集...
  • icavy: 这里探讨 404错...
  • 驾照: 哥们挺厉害的。我练了好久,倒桩和...
  • icavy: 现在不用穿墙也可以...
  • NinGoo: @supsyg 这个应该是tnsname本身有问题吧...
  • supsyg: 试了一下,速度还不错,但tns连接好...
  • 再现9527: 多开开就好,现在的驾校培养出来很...
  • 曾经的阿飞: sysbench可以使用oltp-table-size指定自动...
  • 曾经的阿飞: 你说concurrency参数是不是和sysbench的num...
  • ppg: 有车之人,羡慕啊,正努力赚钱中.....