All Questions
10 questions
0 votes, 1 answer, 711 views
Extract year from timestamp in Hive
I am writing a query to show the data entries for a specific year. The date is stored as dd/mm/yyyy hh:mm:ss (Date TIMESTAMP, e.g. 12/2/2014 0:00:00).
I am trying to display the two columns (name, ...
0 votes, 1 answer, 419 views
How can I get the actual data size per row in Hive SQL?
Is it possible to calculate the actual data size per row in Hive SQL?
I have found this DBA question for MS SQL Server. I am not able to translate the accepted answer to Hive SQL.
I'm ...
1 vote, 2 answers, 133 views
How can I store, retrieve (and perform munging on) large CSV files with Python?
I have a large CSV file of about 5-6 GB (millions of rows), so pandas cannot handle it (it raises a memory error because I only have 2 GB of RAM). I want to use Hadoop on it (i.e., store block of each file on ...
0 votes, 1 answer, 373 views
How to perform denormalization in HBase?
We are trying to migrate our existing RDBMS (SQL database) system to Hadoop, and we plan to use HBase for it. However, we do not understand how to denormalize SQL data to store it in HBase column ...
4 votes, 2 answers, 17k views
Compare two tables in Hive
I have 3 tables in Hive:
Control_table, with known data
New_table, with data to check
Result_table, the table into which records whose values in new_table differ from control_table are inserted
All ...
5 votes, 1 answer, 12k views
How to compare two tables and return rows with differences in Hive
So let's say I have a table with about 180 columns and 100 records.
This table is backed up into a temporary table and the original one is removed.
After this migration (change) is run on a pipeline which ...
0 votes, 1 answer, 650 views
Getting probability density graph & k-means clustering with 300 million rows
The DBMS I use is MySQL (MariaDB).
The table schema is as below:
CREATE TABLE MyTable (
    ID INT PRIMARY KEY,
    TEXT VARCHAR(200),
    VALUE DECIMAL(15,2)
)
The table has 300 million rows or more....
0 votes, 2 answers, 297 views
Real time queries in MongoDB for different criteria and processing the result
New to MongoDB. Is MongoDB efficient for real-time queries where the criteria values change with every query? Also, there will be some aggregation of the result set before sending the ...
2 votes, 1 answer, 220 views
Is there any abstraction layer to work with GFS or HDFS? [closed]
SQL and NoSQL databases are used by Facebook.
1. Does it use GFS, HDFS, both, or something else?
2. What abstraction layers are available for working with HDFS and GFS?
3....
8 votes, 4 answers, 7k views
Advanced queries in HBase
Given the following HBase schema scenario (from the official FAQ)...
How would you design an Hbase table for many-to-many association between two entities, for example Student and Course?
...