{"id":720,"date":"2025-03-29T20:00:00","date_gmt":"2025-03-29T20:00:00","guid":{"rendered":"https:\/\/datablog.roman-halliday.com\/?p=720"},"modified":"2025-03-26T14:34:49","modified_gmt":"2025-03-26T14:34:49","slug":"sample-test-data-for-data-warehousing","status":"publish","type":"post","link":"https:\/\/datablog.roman-halliday.com\/index.php\/2025\/03\/29\/sample-test-data-for-data-warehousing\/","title":{"rendered":"Sample &amp; Test Data for Data Warehousing"},"content":{"rendered":"\n<p>For those of us who like to learn and experiment (and particularly if you want to teach\/provide examples), sample data is a must. Professionals have access to their real-world data (which we can&#8217;t share in whole or in part) to learn with, but even then dummy data\/sample data can be useful as we can do what we like with it, whenever we like.<\/p>\n\n\n\n<p>In this post, I&#8217;m linking to and summarising some sample data sets which are shared with licences allowing people to use and reuse them with a lot of (if not complete) freedom. Because they are open source (or under permissive licences), many have been ported to multiple database servers. If one hasn&#8217;t been ported to the database that you want, then AI (or my <a href=\"https:\/\/github.com\/d-roman-halliday\/db2seed\">db2seed<\/a> tool) can help you convert it.<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">Databases By Creator<\/h1>\n\n\n\n<p>I&#8217;ll start with some common sample databases, generally produced by the training department of each DBMS vendor (or in the case of dbt, the tool developers).<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">dbt Labs<\/h2>\n\n\n\n<p>dbt Labs has a number of sample projects, which are listed on their <a href=\"https:\/\/docs.getdbt.com\/faqs\/Project\/example-projects\">Are there any example dbt projects?<\/a> page. 
One key advantage of these projects is that they work with any database supported by dbt, as the data is provided as <a href=\"https:\/\/docs.getdbt.com\/reference\/commands\/seed\">dbt seeds<\/a> (CSV files) and loaded by dbt.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Jaffle Shop<\/h3>\n\n\n\n<p>This comes in two flavours:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>General <code>jaffle-shop<\/code>: <a href=\"https:\/\/github.com\/dbt-labs\/jaffle-shop\">https:\/\/github.com\/dbt-labs\/jaffle-shop<\/a><\/li>\n\n\n\n<li>DuckDB (run all in one locally) version <code>jaffle_shop_duckdb<\/code>: <a href=\"https:\/\/github.com\/dbt-labs\/jaffle_shop_duckdb\">https:\/\/github.com\/dbt-labs\/jaffle_shop_duckdb<\/a><\/li>\n<\/ul>\n\n\n\n<p>The <code>jaffle-shop<\/code> project has a sister project for generating more data, <code>jaffle-shop-generator<\/code>: <a href=\"https:\/\/github.com\/dbt-labs\/jaffle-shop-generator\">https:\/\/github.com\/dbt-labs\/jaffle-shop-generator<\/a><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Oracle (MySQL)<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Employees Sample Database<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/dev.mysql.com\/doc\/employee\/en\/\">https:\/\/dev.mysql.com\/doc\/employee\/en\/<\/a><\/li>\n<\/ul>\n\n\n\n<p>Licensed under the Creative Commons Attribution-Share Alike 3.0 Unported License.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Sakila Sample Database<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/dev.mysql.com\/doc\/sakila\/en\/\">https:\/\/dev.mysql.com\/doc\/sakila\/en\/<\/a><\/li>\n<\/ul>\n\n\n\n<p>Licensed under the New BSD license, and has been <a href=\"https:\/\/github.com\/jOOQ\/sakila\">ported to other databases<\/a>. 
<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Microsoft<\/h2>\n\n\n\n<p>It came as a surprise to me that Microsoft shares their sample databases under an MIT licence (Microsoft has been good at making samples available, but the MIT licence enables the databases and data within to be ported for other databases).<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/github.com\/microsoft\/sql-server-samples\/tree\/master\/samples\/databases\">https:\/\/github.com\/microsoft\/sql-server-samples\/tree\/master\/samples\/databases<\/a><\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Northwind and pubs sample databases<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/github.com\/microsoft\/sql-server-samples\/tree\/master\/samples\/databases\/northwind-pubs\">https:\/\/github.com\/microsoft\/sql-server-samples\/tree\/master\/samples\/databases\/northwind-pubs<\/a><\/li>\n<\/ul>\n\n\n\n<p>Two sample databases packaged together, which have been ported to other database engines including <a href=\"https:\/\/github.com\/pthom\/northwind_psql\">PostgreSQL<\/a>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">AdventureWorks<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/github.com\/microsoft\/sql-server-samples\/tree\/master\/samples\/databases\/adventure-works\">https:\/\/github.com\/microsoft\/sql-server-samples\/tree\/master\/samples\/databases\/adventure-works<\/a><\/li>\n<\/ul>\n\n\n\n<p>Also available for <a href=\"https:\/\/github.com\/lorint\/AdventureWorks-for-Postgres\">PostgreSQL<\/a>.<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">Other Sample Data Locations<\/h1>\n\n\n\n<h2 class=\"wp-block-heading\">kaggle.com<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/www.kaggle.com\/datasets\">https:\/\/www.kaggle.com\/datasets<\/a><\/li>\n<\/ul>\n\n\n\n<p>Kaggle is a data science competition platform and online community for data scientists and machine learning practitioners under Google LLC. 
It is less appropriate for a lot of Analytics Engineering purposes, but worth a mention here.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">AI\/LLM Generated Data<\/h2>\n\n\n\n<p>My original <a href=\"https:\/\/github.com\/d-roman-halliday\/sql-and-data-notes-and-samples\/tree\/main\/sample_and_test_data\/simple_shop_model\">simple shop model<\/a> (used for very basic testing) was created by a request to ChatGPT, stating that I wanted a dataset consisting of people, products, shopping carts, and shopping cart items. I&#8217;ve since ported the CSVs to SQL for:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/github.com\/d-roman-halliday\/sql-and-data-notes-and-samples\/blob\/main\/oracle\/sample_and_test_data\/simple_shop_model_bulk_inserts.sql\">Oracle: Simple Shop Model<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/d-roman-halliday\/sql-and-data-notes-and-samples\/blob\/main\/MySQL\/sample_and_test_data\/simple_shop_model.sql\">MySQL: Simple Shop Model<\/a><\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Python Generated Data<\/h2>\n\n\n\n<p>I took my idea for the &#8220;AI\/LLM Generated Data&#8221; and my Simple Shop Model, and asked ChatGPT:<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Create python code which uses the <code>faker<\/code> library and the <code>sqlalchemy<\/code> library to generate sample data for a database which has tables for customers, products, shopping carts, shopping cart items, support agents, and support requests<\/p>\n<\/blockquote>\n\n\n\n<p>The sample application (mostly created with ChatGPT, then hand-edited with some tweaks) can be viewed here: <a href=\"https:\/\/github.com\/d-roman-halliday\/simple_shop_data_generator\">https:\/\/github.com\/d-roman-halliday\/simple_shop_data_generator<\/a><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Other Datasets<\/h2>\n\n\n\n<p>If none of the above fit your needs, then these links might be worth investigating:<\/p>\n\n\n\n<ul 
class=\"wp-block-list\">\n<li><a href=\"https:\/\/dumps.wikimedia.org\/\">https:\/\/dumps.wikimedia.org\/<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/en.wikiversity.org\/wiki\/Database_Examples\">https:\/\/en.wikiversity.org\/wiki\/Database_Examples<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/www.re3data.org\/\">https:\/\/www.re3data.org\/<\/a> \n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/www.re3data.org\/repository\/r3d100010960\">https:\/\/www.re3data.org\/repository\/r3d100010960<\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h1 class=\"wp-block-heading\">Loading Data<\/h1>\n\n\n\n<p>In most cases, sample data is produced in a format for loading into the respective database. Many of the database-specific sample datasets above have been ported to other databases.<\/p>\n\n\n\n<p>If a dataset hasn&#8217;t been migrated, then an easy way to do it (raw data only, not including procedures and other database logic) could be to use my Python project <a href=\"https:\/\/github.com\/d-roman-halliday\/db2seed\">db2seed<\/a>, a simple script that:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Reads database connection details from a YAML config file.<\/li>\n\n\n\n<li>Connects to the database using sqlalchemy.<\/li>\n\n\n\n<li>Lists tables in the database.<\/li>\n\n\n\n<li>Enables selection of individual tables.<\/li>\n\n\n\n<li>Exports selected tables to CSV.<\/li>\n\n\n\n<li>Generates a dbt seed configuration YAML file to define data types.<\/li>\n<\/ol>\n\n\n\n<h1 class=\"wp-block-heading\">Summary<\/h1>\n\n\n\n<p>Hopefully the links in this post are useful to Analytics Engineers looking for sample data for their own projects\/learning. 
It also covers approaches for working with the data.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>For those of us who like to learn and experiment, sample data is a must.<br \/>\nIn this post, I&#8217;m linking to and summarising some sample data sets which are shared with licenses allowing people to use and reuse them with a lot of freedom.<\/p>\n","protected":false},"author":1,"featured_media":739,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3,14],"tags":[52,62,34,7],"class_list":["post-720","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-databases","category-sample-data","tag-database","tag-dbt","tag-python","tag-sql"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Sample &amp; Test Data for Data Warehousing - Rows Across The Lake<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/datablog.roman-halliday.com\/index.php\/2025\/03\/29\/sample-test-data-for-data-warehousing\/\" \/>\n<meta property=\"og:locale\" content=\"en_GB\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Sample &amp; Test Data for Data Warehousing - Rows Across The Lake\" \/>\n<meta property=\"og:description\" content=\"For those of us who like to learn and experiment, sample data is a must. 
In this post, I&#039;m linking to and summarising some sample data sets which are shared with licenses allowing people to use and reuse them with a lot of freedom.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/datablog.roman-halliday.com\/index.php\/2025\/03\/29\/sample-test-data-for-data-warehousing\/\" \/>\n<meta property=\"og:site_name\" content=\"Rows Across The Lake\" \/>\n<meta property=\"article:published_time\" content=\"2025-03-29T20:00:00+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/datablog.roman-halliday.com\/wp-content\/uploads\/2025\/03\/Gemini_Generated_Image_rnmtifrnmtifrnmt.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"2048\" \/>\n\t<meta property=\"og:image:height\" content=\"2048\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"david\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@d_roman_h\" \/>\n<meta name=\"twitter:site\" content=\"@d_roman_h\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"david\" \/>\n\t<meta name=\"twitter:label2\" content=\"Estimated reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2025\\\/03\\\/29\\\/sample-test-data-for-data-warehousing\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2025\\\/03\\\/29\\\/sample-test-data-for-data-warehousing\\\/\"},\"author\":{\"name\":\"david\",\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/#\\\/schema\\\/person\\\/575f96d2590c3085923ff9e1b565748b\"},\"headline\":\"Sample &amp; Test Data for Data 
Warehousing\",\"datePublished\":\"2025-03-29T20:00:00+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2025\\\/03\\\/29\\\/sample-test-data-for-data-warehousing\\\/\"},\"wordCount\":770,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/#\\\/schema\\\/person\\\/575f96d2590c3085923ff9e1b565748b\"},\"image\":{\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2025\\\/03\\\/29\\\/sample-test-data-for-data-warehousing\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/wp-content\\\/uploads\\\/2025\\\/03\\\/Gemini_Generated_Image_rnmtifrnmtifrnmt.jpg\",\"keywords\":[\"database\",\"dbt\",\"python\",\"SQL\"],\"articleSection\":[\"Databases\",\"Sample Data\"],\"inLanguage\":\"en-GB\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2025\\\/03\\\/29\\\/sample-test-data-for-data-warehousing\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2025\\\/03\\\/29\\\/sample-test-data-for-data-warehousing\\\/\",\"url\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2025\\\/03\\\/29\\\/sample-test-data-for-data-warehousing\\\/\",\"name\":\"Sample &amp; Test Data for Data Warehousing - Rows Across The 
Lake\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2025\\\/03\\\/29\\\/sample-test-data-for-data-warehousing\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2025\\\/03\\\/29\\\/sample-test-data-for-data-warehousing\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/wp-content\\\/uploads\\\/2025\\\/03\\\/Gemini_Generated_Image_rnmtifrnmtifrnmt.jpg\",\"datePublished\":\"2025-03-29T20:00:00+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2025\\\/03\\\/29\\\/sample-test-data-for-data-warehousing\\\/#breadcrumb\"},\"inLanguage\":\"en-GB\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2025\\\/03\\\/29\\\/sample-test-data-for-data-warehousing\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2025\\\/03\\\/29\\\/sample-test-data-for-data-warehousing\\\/#primaryimage\",\"url\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/wp-content\\\/uploads\\\/2025\\\/03\\\/Gemini_Generated_Image_rnmtifrnmtifrnmt.jpg\",\"contentUrl\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/wp-content\\\/uploads\\\/2025\\\/03\\\/Gemini_Generated_Image_rnmtifrnmtifrnmt.jpg\",\"width\":2048,\"height\":2048},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2025\\\/03\\\/29\\\/sample-test-data-for-data-warehousing\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Sample &amp; Test Data for Data 
Warehousing\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/#website\",\"url\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/\",\"name\":\"Rows Across The Lake\",\"description\":\"Data &amp; Databases\",\"publisher\":{\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/#\\\/schema\\\/person\\\/575f96d2590c3085923ff9e1b565748b\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-GB\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/#\\\/schema\\\/person\\\/575f96d2590c3085923ff9e1b565748b\",\"name\":\"david\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/acddbc676a1d5c73795edcf0627ee39e5aa947da9033b58373e03d93122cb3b7?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/acddbc676a1d5c73795edcf0627ee39e5aa947da9033b58373e03d93122cb3b7?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/acddbc676a1d5c73795edcf0627ee39e5aa947da9033b58373e03d93122cb3b7?s=96&d=mm&r=g\",\"caption\":\"david\"},\"logo\":{\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/acddbc676a1d5c73795edcf0627ee39e5aa947da9033b58373e03d93122cb3b7?s=96&d=mm&r=g\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Sample &amp; Test Data for Data Warehousing - Rows Across The Lake","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/datablog.roman-halliday.com\/index.php\/2025\/03\/29\/sample-test-data-for-data-warehousing\/","og_locale":"en_GB","og_type":"article","og_title":"Sample &amp; Test Data for Data Warehousing - Rows Across The Lake","og_description":"For those of us who like to learn and experiment, sample data is a must. In this post, I'm linking to and summarising some sample data sets which are shared with licenses allowing people to use and reuse them with a lot of freedom.","og_url":"https:\/\/datablog.roman-halliday.com\/index.php\/2025\/03\/29\/sample-test-data-for-data-warehousing\/","og_site_name":"Rows Across The Lake","article_published_time":"2025-03-29T20:00:00+00:00","og_image":[{"width":2048,"height":2048,"url":"https:\/\/datablog.roman-halliday.com\/wp-content\/uploads\/2025\/03\/Gemini_Generated_Image_rnmtifrnmtifrnmt.jpg","type":"image\/jpeg"}],"author":"david","twitter_card":"summary_large_image","twitter_creator":"@d_roman_h","twitter_site":"@d_roman_h","twitter_misc":{"Written by":"david","Estimated reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/datablog.roman-halliday.com\/index.php\/2025\/03\/29\/sample-test-data-for-data-warehousing\/#article","isPartOf":{"@id":"https:\/\/datablog.roman-halliday.com\/index.php\/2025\/03\/29\/sample-test-data-for-data-warehousing\/"},"author":{"name":"david","@id":"https:\/\/datablog.roman-halliday.com\/#\/schema\/person\/575f96d2590c3085923ff9e1b565748b"},"headline":"Sample &amp; Test Data for Data 
Warehousing","datePublished":"2025-03-29T20:00:00+00:00","mainEntityOfPage":{"@id":"https:\/\/datablog.roman-halliday.com\/index.php\/2025\/03\/29\/sample-test-data-for-data-warehousing\/"},"wordCount":770,"commentCount":0,"publisher":{"@id":"https:\/\/datablog.roman-halliday.com\/#\/schema\/person\/575f96d2590c3085923ff9e1b565748b"},"image":{"@id":"https:\/\/datablog.roman-halliday.com\/index.php\/2025\/03\/29\/sample-test-data-for-data-warehousing\/#primaryimage"},"thumbnailUrl":"https:\/\/datablog.roman-halliday.com\/wp-content\/uploads\/2025\/03\/Gemini_Generated_Image_rnmtifrnmtifrnmt.jpg","keywords":["database","dbt","python","SQL"],"articleSection":["Databases","Sample Data"],"inLanguage":"en-GB","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/datablog.roman-halliday.com\/index.php\/2025\/03\/29\/sample-test-data-for-data-warehousing\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/datablog.roman-halliday.com\/index.php\/2025\/03\/29\/sample-test-data-for-data-warehousing\/","url":"https:\/\/datablog.roman-halliday.com\/index.php\/2025\/03\/29\/sample-test-data-for-data-warehousing\/","name":"Sample &amp; Test Data for Data Warehousing - Rows Across The 
Lake","isPartOf":{"@id":"https:\/\/datablog.roman-halliday.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/datablog.roman-halliday.com\/index.php\/2025\/03\/29\/sample-test-data-for-data-warehousing\/#primaryimage"},"image":{"@id":"https:\/\/datablog.roman-halliday.com\/index.php\/2025\/03\/29\/sample-test-data-for-data-warehousing\/#primaryimage"},"thumbnailUrl":"https:\/\/datablog.roman-halliday.com\/wp-content\/uploads\/2025\/03\/Gemini_Generated_Image_rnmtifrnmtifrnmt.jpg","datePublished":"2025-03-29T20:00:00+00:00","breadcrumb":{"@id":"https:\/\/datablog.roman-halliday.com\/index.php\/2025\/03\/29\/sample-test-data-for-data-warehousing\/#breadcrumb"},"inLanguage":"en-GB","potentialAction":[{"@type":"ReadAction","target":["https:\/\/datablog.roman-halliday.com\/index.php\/2025\/03\/29\/sample-test-data-for-data-warehousing\/"]}]},{"@type":"ImageObject","inLanguage":"en-GB","@id":"https:\/\/datablog.roman-halliday.com\/index.php\/2025\/03\/29\/sample-test-data-for-data-warehousing\/#primaryimage","url":"https:\/\/datablog.roman-halliday.com\/wp-content\/uploads\/2025\/03\/Gemini_Generated_Image_rnmtifrnmtifrnmt.jpg","contentUrl":"https:\/\/datablog.roman-halliday.com\/wp-content\/uploads\/2025\/03\/Gemini_Generated_Image_rnmtifrnmtifrnmt.jpg","width":2048,"height":2048},{"@type":"BreadcrumbList","@id":"https:\/\/datablog.roman-halliday.com\/index.php\/2025\/03\/29\/sample-test-data-for-data-warehousing\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/datablog.roman-halliday.com\/"},{"@type":"ListItem","position":2,"name":"Sample &amp; Test Data for Data Warehousing"}]},{"@type":"WebSite","@id":"https:\/\/datablog.roman-halliday.com\/#website","url":"https:\/\/datablog.roman-halliday.com\/","name":"Rows Across The Lake","description":"Data &amp; 
Databases","publisher":{"@id":"https:\/\/datablog.roman-halliday.com\/#\/schema\/person\/575f96d2590c3085923ff9e1b565748b"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/datablog.roman-halliday.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-GB"},{"@type":["Person","Organization"],"@id":"https:\/\/datablog.roman-halliday.com\/#\/schema\/person\/575f96d2590c3085923ff9e1b565748b","name":"david","image":{"@type":"ImageObject","inLanguage":"en-GB","@id":"https:\/\/secure.gravatar.com\/avatar\/acddbc676a1d5c73795edcf0627ee39e5aa947da9033b58373e03d93122cb3b7?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/acddbc676a1d5c73795edcf0627ee39e5aa947da9033b58373e03d93122cb3b7?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/acddbc676a1d5c73795edcf0627ee39e5aa947da9033b58373e03d93122cb3b7?s=96&d=mm&r=g","caption":"david"},"logo":{"@id":"https:\/\/secure.gravatar.com\/avatar\/acddbc676a1d5c73795edcf0627ee39e5aa947da9033b58373e03d93122cb3b7?s=96&d=mm&r=g"}}]}},"_links":{"self":[{"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/posts\/720","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/comments?post=720"}],"version-history":[{"count":4,"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/posts\/720\/revisions"}],"predecessor-version":[{"id":740,"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/posts\/720\/revisions\/740"}
],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/media\/739"}],"wp:attachment":[{"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/media?parent=720"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/categories?post=720"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/tags?post=720"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}