Posted by
Message
Hu_Flung_Pu
USA Fan
Central, LA
Member since Jan 2013
19607 posts
 Online 

Need help with a database comparison
Have two databases with roads and lengths. I want to consolidate the duplicates in each database (Ex. Two 1300 Main St with two different distances) and compare it with the other table and see if the distances match.

Example
Database A
1300 Main St / .5 miles
1300 Main St / .6 miles

Need 1300 Main St / 1.1 miles

Database B
1300 Main St / .5 miles
1300 Main St / .5 miles

Need 1300 Main St / 1.0 miles

Then compare the two to see a .1 mile difference.

There are over 400k records in one database and 200k in the other.


CAD703X
Toledo Fan
Liberty Island
Member since Jul 2008
65475 posts

re: Need help with a database comparison
create a new table and do a calculation query from DB-A and DB-B and put the consolidated results into a new row.

my sql sucks but you'd want the sum of all UNIQUE results (assuming the .5 and .6 are in separate records)

if it was me i'd get an export from both DBs and put them into a spreadsheet and you could knock this out pretty quick.


Hu_Flung_Pu
USA Fan
Central, LA
Member since Jan 2013
19607 posts
 Online 

re: Need help with a database comparison
Coo Coo Coo Coo Cool.....Now like I'm dumb


CAD703X
Toledo Fan
Liberty Island
Member since Jul 2008
65475 posts

re: Need help with a database comparison
quote:

Coo Coo Coo Coo Cool.....Now like I'm dumb


ok i may get a little technical (i know just enough to be a bad hack at this stuff) and someone else smarter than me will come along

- are you really talking 2 different databases or 2 different tables in a single DB?
- what DB? cloud/aurora/sql server/oracle?
- you should be able to get an export in any case which may be easiest for what you're doing (.xlxs export if possible, tab-delinated if not). the new version of excel supports a million rows so you could pull DB1 into a worksheet and DB2 into a second worksheet so its easier to calculate the difference between db1's main-street and db2's main street.
- since the numbers dont add up (400k in one db, 200k in the other) what do you want to do for those that don't exist in both places? only display the streets where the mileage has a discreprency?

- once its in excel, its pretty much cake. you might even be able to do it with a pivot table that combines all rows with the same title and summarizes the distance.

- again i'm assuming the distance is in a separate record/field and not part of the '123 main street'.

- even if the mileage *IS* in the same field as the road name, you can use some string concatenation/manipulation to filter out the numbers and place them into their own column so you can do math on it.

so again, not sure i'm answering your question but this seems very doable without too much effort.
This post was edited on 6/30 at 2:39 pm


Hu_Flung_Pu
USA Fan
Central, LA
Member since Jan 2013
19607 posts
 Online 

re: Need help with a database comparison
So I have both of them in a single workbook and two different worksheets in Excel using CSV.

If there is no difference or they don't exist in both, I have no need for them. I only need the ones that have a match in both and a discrepancy.


BobRoss
Fresno State Fan
Member since Jun 2014
1266 posts

re: Need help with a database comparison
Do you have SQL Server Studio Manager? I wouldn't work with that many rows in excel.


Replies (0)
Replies (0)
30
CAD703X
Toledo Fan
Liberty Island
Member since Jul 2008
65475 posts

re: Need help with a database comparison
can you take a screenshot at least? Need to know how the data looks or send me 20 rows or so and I can do the calculation from that then you can just paste the formula.

Step 1 is to merge the rows with the same names and summarize the distance into column B using a pivot table.

Step 2 would be to copy the pivot table results from worksheet a and worksheet b into worksheet c and then do another pivot except this time put the difference into column b instead of the sum


td1
Baton Rouge
Member since Oct 2015
1005 posts

re: Need help with a database comparison
What is the use case? Is this to look up each one and compare when needed or to have them all in one place already combined and compared?

If the latter, this is something we would script and dump the values in a new table.

New table fields (Based on only two distances in each table)
Uniqueid
Street_name
Tbla_dist1
Tbla_dist2
Tblb_dist1
Tblb_dist2
Dist_diff

Get all unique address values, get distances from each unique value that has more than one record, dump those into the new table in their corresponding fields, do the same for the other table. Run back through the new table to calculate and update the difference field, but we mostly do those calculations programmatically vs storing them in a table. Such as you run a report and it grabs the values and then calculates the difference, way better if those distance values can change over time.

You could also normalize the new table from the beginning if you are going to have different numbers of distances for each address and not just two. It would be the best way long term for queries and space savings, but quick and dirty would be to just go with that one new table.








This post was edited on 6/30 at 5:38 pm


Hulkklogan
LSU Fan
Baton Rouge, LA
Member since Oct 2010
39748 posts

re: Need help with a database comparison
Yeah sounds like a job for Python.


TD SponsorTD Fan
USA
Member since 2001
Thank you for supporting our sponsors
Advertisement
td1
Baton Rouge
Member since Oct 2015
1005 posts

re: Need help with a database comparison
Regardless of how you do it, expect it to take a little while with 600k records.

We are a php and asp shop, but python would certainly do it.

This post was edited on 6/30 at 5:56 pm


Replies (0)
Replies (0)
00
Hu_Flung_Pu
USA Fan
Central, LA
Member since Jan 2013
19607 posts
 Online 

re: Need help with a database comparison
quote:

Step 1 is to merge the rows with the same names and summarize the distance into column B using a pivot table.

Step 2 would be to copy the pivot table results from worksheet a and worksheet b into worksheet c and then do another pivot except this time put the difference into column b instead of the sum


Did a pivot on one then did a vlookup. Did the first few records and it did well. Going to just subtract the vlookup results to the original and see any difference.


BoogaBear
LSU Fan
Member since Jul 2013
3515 posts
 Online 

re: Need help with a database comparison
quote:

Example
Database A
1300 Main St / .5 miles
1300 Main St / .6 miles

Need 1300 Main St / 1.1 miles

Database B
1300 Main St / .5 miles
1300 Main St / .5 miles


Get an open source ETL tool like Pentaho or Talend. In a single transformation you can read records from both A and B, clean up the records, then compare the distances.

You can do whatever you want with them after that, filter differences out to a table, excel file, etc.


CAD703X
Toledo Fan
Liberty Island
Member since Jul 2008
65475 posts

re: Need help with a database comparison
how did this work out? excel is going to always be way easier than learning how to write the correct SQL for someone not used to that even if SQL is much more powerful.


messyjesse
Member since Nov 2015
1383 posts
 Online 

re: Need help with a database comparison
quote:

Get an open source ETL tool like Pentaho or Talend. In a single transformation you can read records from both A and B, clean up the records, then compare the distances.

You can do whatever you want with them after that, filter differences out to a table, excel file, etc.


Hell even that sounds like overkill if the data's already in a spreadsheet/csv. Just use Excel's built-in Power Query.


Replies (0)
Replies (0)
00
Hu_Flung_Pu
USA Fan
Central, LA
Member since Jan 2013
19607 posts
 Online 

re: Need help with a database comparison
I haven't finished lol. 16GB of RAM ain't enough. I even brought it home to my computer which has an 8700k at 5Ghz and fast 16GB RAM and it crashed.


Mingo Was His NameO
LSU Fan
Brooklyn
Member since Mar 2016
13718 posts

re: Need help with a database comparison
Alteryx could do this in 5 seconds.

Group by address, sum distance then join by address and compare distance values. Would be done in less than 15 seconds
This post was edited on 7/2 at 10:48 am


Replies (0)
Replies (0)
01
td1
Baton Rouge
Member since Oct 2015
1005 posts

re: Need help with a database comparison
Bc Excel wasn’t designed to handle hundreds of thousands of “records”.


GetMeOutOfHere
Member since Aug 2018
40 posts
 Online 

re: Need help with a database comparison
quote:

Bc Excel wasn’t designed to handle hundreds of thousands of “records”.


If all you have is a hammer, everything looks like a nail.


Replies (0)
Replies (0)
00
first pageprev pagePage 1 of 1next pagelast page
refresh

Back to top

logoFollow TigerDroppings for LSU Football News
Follow us on Twitter, Facebook and Instagram to get the latest updates on LSU Football and Recruiting.

FacebookTwitterInstagram