Newest 'sql+nosql+hadoop' Questions

0 votes

1 answer

711 views

Extract year from timestamp in hive

I am writing a query to show the data entries for a specific year. The date is stored as dd/mm/yyyy hh:mm:ss (Date TIMESTAMP, e.g. 12/2/2014 0:00:00). I am trying to display the two columns (name, ...

NoobCoder123

5

asked Jul 2, 2021 at 20:46
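
A minimal sketch of the year filter this question describes, written in Python rather than Hive. It assumes the values are plain strings in the day/month/year order given in the excerpt; the helper name `extract_year` and the sample rows are hypothetical and only illustrate the parsing step.

```python
from datetime import datetime

# Assumed format, taken from the excerpt: dd/mm/yyyy hh:mm:ss
# (for example "12/2/2014 0:00:00").
DATE_FORMAT = "%d/%m/%Y %H:%M:%S"


def extract_year(raw: str) -> int:
    """Parse a dd/mm/yyyy hh:mm:ss string and return its year."""
    return datetime.strptime(raw, DATE_FORMAT).year


# Hypothetical sample rows: (name, date string).
rows = [("alice", "12/2/2014 0:00:00"), ("bob", "3/11/2015 8:30:00")]

# Keep only the entries that fall in one specific year.
rows_2014 = [(name, raw) for name, raw in rows if extract_year(raw) == 2014]
```

Parsing with an explicit format string avoids guessing between day-first and month-first order, which matters for a value such as 12/2/2014.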

0 votes

1 answer

419 views

How can I get the actual data size per row in Hive SQL?

Is it possible to calculate the actual data size per row in Hive SQL? I have found this DBA question for MS SQL Server, but I am not able to translate the accepted answer to Hive SQL. I'm ...

Ashkan

1,673

asked May 27, 2021 at 18:50

1 vote

2 answers

133 views

How can I store, retrieve (and perform munging on) large CSV files with Python?

I have a large CSV file of about 5-6 GB (millions of rows), so pandas cannot handle it (it gives a memory error, as my RAM capacity is 2 GB). I want to use Hadoop on it (i.e., store block of each file on ...

Vipul Singh

37

asked Sep 28, 2017 at 5:51
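
One way to stay within a small memory budget, independent of Hadoop, is to stream the file row by row instead of loading it whole. The sketch below uses only the Python standard library; the function name `count_by_column`, the column names, and the in-memory sample are hypothetical stand-ins for the real file, which the excerpt does not describe.

```python
import csv
import io
from collections import Counter


def count_by_column(lines, column):
    """Stream rows one at a time and count the values seen in one column."""
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[row[column]] += 1
    return counts


# Hypothetical stand-in for the large file; in practice pass an open
# file handle, e.g. open(path, newline="").
sample = io.StringIO("city,amount\nPune,10\nDelhi,5\nPune,7\n")
result = count_by_column(sample, "city")
```

Only one row is held in memory at a time, so the peak usage depends on the number of distinct values in the column rather than on the file size.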

0 votes

1 answer

373 views

How to perform denormalization in HBase?

We are trying to migrate our existing RDBMS (SQL database) system to Hadoop. We are planning to use HBase for this, but we do not understand how to denormalize SQL data to store it in HBase column ...

Bunny

439

asked Sep 8, 2016 at 12:06

4 votes

2 answers

17k views

Compare two tables in HIVE

I have 3 tables in Hive: Control_table, with known data; New_table, with data to check; Result_table, a table into which records whose values in new_table differ from control_table are inserted. All ...

Jakub Zak

1,232

asked Oct 16, 2014 at 8:47

5 votes

1 answer

12k views

How to compare two tables and return rows with difference with HIVE

So let's say I have a table with about 180 columns and 100 records. This table is backed up into a temporary table and the original one is removed. After this, a migration (change) is run on a pipeline which ...

Jakub Zak

1,232

asked Oct 15, 2014 at 9:48

0 votes

1 answer

650 views

Getting probability density graph & k-means clustering with 300 million rows

The DBMS I use is MySQL (MariaDB). The table schema is as below: CREATE TABLE MyTable ( ID INT PRIMARY KEY, TEXT VARCHAR(200), VALUE DECIMAL(15,2) ) The table has 300 million rows or more....

Keith Park

607

asked Aug 11, 2014 at 14:45

0 votes

2 answers

297 views

Real time queries in MongoDB for different criteria and processing the result

New to MongoDB. Is MongoDB efficient for real-time queries where the criteria values change on every query? There will also be some aggregation of the result set before sending the ...

user203617

533

asked Feb 25, 2014 at 16:45

2 votes

1 answer

220 views

Is there any abstraction layer to work with GFS or HDFS? [closed]

SQL and NoSQL databases are used by Facebook. 1. Does it use GFS, HDFS, both, or something else? 2. What abstraction layers are available for working with HDFS and GFS? 3....

JAVA Beginner

481

asked Dec 19, 2013 at 6:49

8 votes

4 answers

7k views

Advanced queries in HBase

Given the following HBase schema scenario (from the official FAQ)... How would you design an HBase table for a many-to-many association between two entities, for example Student and Course? ...

Teflon Ted

8,806

asked Sep 16, 2009 at 23:50
