当前位置: 移动技术网 > IT编程>开发语言>Java > 使用bitset实现毫秒级查询(实例讲解)

使用bitset实现毫秒级查询(实例讲解)

2019年07月19日  | 移动技术网IT编程  | 我要评论

前言

因为业务要求api的一次请求响应时间在10ms以内,所以传统的数据库查询操作直接被排除(网络io和磁盘io)。通过调研,最终使用了bieset,目前已经正常运行了很久

bitset介绍

看jdk中的解释简直一头雾水,用我自己的理解概括一下

1.bitset的内部实现是long数组
2.set中每一个位的默认值为false(0)
3.bitset长度按需增长
4.bitset非线程安全

bitset关键方法分析

/**
 * sets the bit at the specified index to {@code true}.
 *
 * @param bitindex a bit index
 * @throws indexoutofboundsexception if the specified index is negative
 * @since jdk1.0
 */
public void set(int bitindex) {
 if (bitindex < 0)
  throw new indexoutofboundsexception("bitindex < 0: " + bitindex);

 int wordindex = wordindex(bitindex);
 expandto(wordindex);

 words[wordindex] |= (1l << bitindex); // restores invariants

 checkinvariants();
}

设置指定“位”为true,bitindex参数为非负整数。假设我们执行以下代码,观察上面代码中worindex,words[wordindex]值的变化

bitset bs = new bitset()
bs.set(0);
bs.set(1);
bs.set(2);
bs.set(3);
bs.set(4);

bitindex wordindex words[wordindex] words[wordindex]二进制表示
0 0 1 0001
1 0 3 0011
2 0 7 0111
3 0 15 1111
4 0 31 0001 1111

通过上表,我们可以很清晰的根据bitindex和words[wordindex]二进制值的对应关系,得到一个结论,即:bitset中每一个long可以表示64个非负整数在bitset中存在与否。例如:0001可以表示整数0在bitset中存在,1111可以表示整数3,2,1,0在bitset中存在。

进入正题,实现bitset毫秒级查询

想象一个场景,我们有一张user表

create table `user` (
 `id` int(11) not null auto_increment,
 `name` varchar(255) not null,
 `address` varchar(255) not null comment '地址',
 `gender` varchar(10) not null comment '性别',
 `age` varchar(10) not null,
 primary key (`uid`)
) engine=innodb auto_increment=0 default charset=utf8;

假设我们要查询“北京市18岁的女生”,那么对应的sql如下:

select * from `user` where address='北京' and age='18' and gender='girl';

如何使用bitset实现同样的查询呢?

1.将user表数据加载进内存中

2.为user表建立address,age,gender维度的bitset索引

3.根据索引查询数据

1.将user表数据加载进内存中

将user表从数据库读取出来存入list

user实体类:

public class user implements cloneable {
 private string name;
 private string address;
 private string gender;
 private string age;

 @override
 public string tostring() {
  return "user [name=" + name + ", address=" + address + ", gender=" + gender + ", age=" + age + "]";
 }

 @override
 public user clone() {
  user user = null;
  try {
   user = (user) super.clone();
  } catch (clonenotsupportedexception e) {
   e.printstacktrace();
  }
  return user;
 }
 //省略get set 方法。。。

2.建立索引

创建bitset索引模型类

public class bitsetindexmodel {
 private string type;//索引类型:address,age,gender
 private concurrenthashmap<string, integer> map;//索引类型和bitset在bslist中下标的映射关系
 private list<string> list;//索引类型的值集合,例如gender:girl,boy
 private list<bitset> bslist;//bitset集合

 public bitsetindex() {
  
 }

 public bitsetindexmodel(string type) {
  this.type = type;
  map = new concurrenthashmap<string, integer>();
  list = new arraylist<string>();
  bslist = new arraylist<bitset>();
 }
 
 public string gettype() {
  return type;
 }

 public void settype(string type) {
  this.type = type;
 }

 public map<string, integer> getmap() {
  return map;
 }

 public void setmap(concurrenthashmap<string, integer> map) {
  this.map = map;
 }

 public list<string> getlist() {
  return list;
 }

 public void setlist(list<string> list) {
  this.list = list;
 }

 public list<bitset> getbslist() {
  return bslist;
 }

 public void setbslist(list<bitset> bslist) {
  this.bslist = bslist;
 }

 /**
  *
  * @param str
  * @param i
  */
 public void createindex(string str, int i) {
  bitset bitset = null;
  //获取‘str'对应的bitset在bslist中的下标
  integer index = this.getmap().get(str);
  if (index != null) {
   bitset = this.getbslist().get(index);
   if (bitset == null) {
    bitset = new bitset();
    this.getbslist().add(index, bitset);
   }
   bitset.set(i, true);// 将str对应的位置为true,true可省略
  } else {
   bitset = new bitset();
   list<string> list = this.getlist();
   list.add(str);
   index = list.size() - 1;
   bitset.set(i, true);
   this.getbslist().add(bitset);
   this.getmap().put(str, index);
  }
 }
 
 /**
  * 从entity里拿出符合条件的bitset
  *
  * @param str
  * @return
  */
 public bitset get(string str) {
  bitset bitset = null;
  str = str.tolowercase();
  integer index = this.getmap().get(str);

  if (index != null) {
   bitset = this.getbslist().get(index);
  } else {
   bitset = new bitset();
  }
  return bitset;
 }

 /**
  * bitset的与运算
  *
  * @param str
  * @param bitset
  * @return
  */
 public bitset and(string str, bitset bitset) {
  if (str != null) {
   str = str.tolowercase();
   if (bitset != null) {
    bitset.and(get(str));
   } else {
    bitset = new bitset();
    bitset.or(get(str));
   }
  }
  return bitset;
 }

 /**
  * bitset的或运算
  *
  * @param str
  * @param bitset
  * @return
  */
 public bitset or(string str, bitset bitset) {
  if (str != null) {
   str = str.tolowercase();
   if (bitset != null) {
    bitset.or(get(str));
   } else {
    bitset = new bitset();
    bitset.or(get(str));
   }
  }
  return bitset;
 }

 /**
  * 获取bitset值为true的 即 把 bitset翻译为list的索引
  *
  * @param bitset
  * @return
  */
 public static list<integer> getrealindexs(bitset bitset) {
  list<integer> indexs = new arraylist<integer>();
  if (bitset != null) {
   int i = bitset.nextsetbit(0);
   if (i != -1) {
    indexs.add(i);
    for (i = bitset.nextsetbit(i + 1); i >= 0; i = bitset.nextsetbit(i + 1)) {
     int endofrun = bitset.nextclearbit(i);
     do {
      indexs.add(i);
     } while (++i < endofrun);
    }
   }
  }

  return indexs;
 }
 
}

为每一个user对象创建address,gender,age维度索引

public class userindexstore {

 private static final string address = "address";
 private static final string gender = "gender";
 private static final string age = "age";
 private bitsetindexmodel address;
 private bitsetindexmodel gender;
 private bitsetindexmodel age;
 private concurrenthashmap<integer, user> usermap;//存储所有的user数据

 public static final userindexstore instance = getinstance();

 private userindexstore() {
  init();
 }

 public static userindexstore getinstance() {
  return userindexstoreholder.instance;
 }

 private static class userindexstoreholder {
  private static userindexstore instance = new userindexstore();
 }

 private void init() {
  this.address = new bitsetindexmodel(address);
  this.gender = new bitsetindexmodel(gender);
  this.age = new bitsetindexmodel(age);
  usermap = new concurrenthashmap<integer, user>();
 }

 /**
  * 构建索引
  * @param users 
  */
 public void createindex(list<user> users) {
  if (users != null && users.size() > 0) {
   for (int index = 0; index < users.size(); index++) {
    user user = users.get(index);
    getaddress().update(user.getaddress(), index);
    getgender().update(user.getgender(), index);
    getage().update(user.getage(), index);
    this.usermap.put(index, user);
   }
  }
 }

 public bitset query(string address, string gender, string age) {
  bitset bitset = null;
  bitset = getaddress().and(address, bitset);
  bitset = getgender().and(gender, bitset);
  bitset = getage().and(age, bitset);
  return bitset;
 }

 public user finduser(integer index) {
  user user = this.usermap.get(index);
  if (user != null) {
   return user.clone();//可能会对user做修改操作,要保证内存原始数据不变
  }
  return null;
 }

 public bitsetindexmodel getaddress() {
  return address;
 }

 public void setaddress(bitsetindexmodel address) {
  this.address = address;
 }

 public bitsetindexmodel getgender() {
  return gender;
 }

 public void setgender(bitsetindexmodel gender) {
  this.gender = gender;
 }

 public bitsetindexmodel getage() {
  return age;
 }

 public void setage(bitsetindexmodel age) {
  this.age = age;
 }

}

3.测试bitset

public class bitsettest {

 public static void main(string[] args) {
  list<user> users = builddata();
  userindexstore.getinstance().createindex(users);
  executorservice executorservice = executors.newfixedthreadpool(20);
  int num = 2000;
  long begin1 = system.currenttimemillis();
  for (int i = 0; i < num; i++) {
   runnable syncrunnable = new runnable() {
    @override
    public void run() {
     list<integer> indexs = bitsetindexmodel.getrealindexs(userindexstore.getinstance().query("北京", "girl", "18"));
     for (integer index : indexs) {
      userindexstore.getinstance().finduser(index);
     }
    }
   };
   executorservice.execute(syncrunnable);
  }
  executorservice.shutdown();
  while (true) {
   try {
    if (executorservice.awaittermination(1, timeunit.seconds)) {
     system.out.println("单次查询时间为:" + (system.currenttimemillis() - begin1) / num + "ms");
     break;
    }
   } catch (interruptedexception e) {
    e.printstacktrace();
   }
  }
 }

 private static list<user> builddata() {
  string[] addresss = { "北京", "上海", "深圳" };
  string[] ages = { "16", "17", "18" };
  list<user> users = new arraylist<>();
  for (int i = 0; i < 200000; i++) {
   user user = new user();
   user.setname("name" + i);
   int rand = threadlocalrandom.current().nextint(3);
   user.setaddress(addresss[threadlocalrandom.current().nextint(3)]);
   user.setgender((rand & 1) == 0 ? "girl" : "boy");
   user.setage(ages[threadlocalrandom.current().nextint(3)]);
   users.add(user);
  }
  return users;
 }

}

测试结果(查询2w次):

数据量(users.size()) | 并发数 | 平均查询时间

---|---|---
20w | 10 | 1ms
50w | 20 | 3ms
100w| 50 | 9ms

测试机为thinkpad x240 i5 8g内存

4.总结

==优点==:

通过测试发现随着数据量的增大和并发数的提高,平均耗时并没有明显升高,并且响应时间都在10ms以内

==缺点==:

1.不适合数据量太大的情况,因为需要把数据全部加载进内存

2.不适合复杂查询

3.不适合对name,id等唯一值做查询

后记

因为我们的查询业务比较简单,唯一的要求是速度,并且数据量也不大,每张表的数据量都不超过100w,所以使用这种方式比较合适。

在本篇文章中只谈到了如何创建索引,以及最基本的查询,在下一篇中我会继续说明如何更新索引,以及一些复杂查询,比如<,>,between and。

以上这篇使用bitset实现毫秒级查询(实例讲解)就是小编分享给大家的全部内容了,希望能给大家一个参考,也希望大家多多支持移动技术网。

如对本文有疑问, 点击进行留言回复!!

相关文章:

验证码:
移动技术网