我在JDBC中使用mysql.
我有一个大型示例表,其中包含630万行,我正在尝试执行有效的选择查询.见下文:
我在表上创建了三个附加索引,见下文:
执行这样的SELECT
查询SELECT latitude, longitude FROM 3dag WHERE
timestamp BETWEEN "+startTime+" AND "+endTime+" AND HourOfDay=4 AND DayOfWeek=3"
的运行时间非常高,为256356 ms,或略高于4分钟.我对同一个查询的解释给了我:
我检索数据的代码如下:
Connection con = null; PreparedStatement pst = null; Statement stmt = null; ResultSet rs = null; String url = "jdbc:mysql://xxx.xxx.xxx.xx:3306/testdb"; String user = "bigd"; String password = "XXXXX"; try { Class.forName("com.mysql.jdbc.Driver"); con = DriverManager.getConnection(url, user, password); String query = "SELECT latitude, longitude FROM 3dag WHERE timestamp BETWEEN "+startTime+" AND "+endTime+" AND HourOfDay=4 AND DayOfWeek=3"; stmt = con.prepareStatement("SELECT latitude, longitude FROM 3dag WHERE timestamp>=" + startTime + " AND timestamp<=" + endTime); stmt = con.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY, java.sql.ResultSet.CONCUR_READ_ONLY); stmt.setFetchSize(Integer.MIN_VALUE); rs = stmt.executeQuery(query); System.out.println("Start"); while (rs.next()) { int tempLong = (int) ((Double.parseDouble(rs.getString(2))) * 100000); int x = (int) (maxLong * 100000) - tempLong; int tempLat = (int) ((Double.parseDouble(rs.getString(1))) * 100000); int y = (int) (maxLat * 100000) - tempLat; if (!(y > matrix.length) || !(y < 0) || !(x > matrix[0].length) || !(x < 0)) { matrix[y][x] += 1; } } System.out.println("End"); JSONObject obj = convertToCRS(matrix); return obj; }catch (ClassNotFoundException ex){ Logger lgr = Logger.getLogger(Database.class.getName()); lgr.log(Level.SEVERE, ex.getMessage(), ex); return null; } catch (SQLException ex) { Logger lgr = Logger.getLogger(Database.class.getName()); lgr.log(Level.SEVERE, ex.getMessage(), ex); return null; } finally { try { if (rs != null) { rs.close(); } if (pst != null) { pst.close(); } if (con != null) { con.close(); } } catch (SQLException ex) { Logger lgr = Logger.getLogger(Database.class.getName()); lgr.log(Level.WARNING, ex.getMessage(), ex); return null; } }
删除while(rs.next())
循环中的每一行都会给我带来同样可怕的运行时间.
我的问题是我可以做些什么来优化这种类型的查询?我很好奇.setFetchSize()
这里应该有什么最佳值.文档显示INTEGER.MIN_VALUE导致逐行获取,这是正确的吗?
任何帮助表示赞赏.
编辑 在时间戳,DayOfWeek和HourOfDay上创建新索引后,我的查询运行速度提高1分钟并解释给我: