DocEng 2013, September 10–13, 2013, Florence, Italy

Splitting Wide Tables Optimally
Mihai Bilauca

Patrick Healy

Department of Computer Science and Information Systems
University of Limerick, Ireland

Supported by Science Foundation Ireland under the research programme 01/P1.2/C009,
Mathematical Foundations, Practical Notations, and Tools for Reliable Flexible Software.
Why this paper?
• Tables are widely used for presenting logical relationships between data items;
• Widely used WYSIWYG tools have poor support for wide tables;
• Authoring tables is hard, time-consuming and error-prone;
• Style manual recommendations are not always supported;
• There is very little research in this area.
A wide table split across multiple pages
Grouping of data items increases readability
Style recommendations from Chicago Manual of Style
“For a two-page broadside table – which should be presented
on facing pages if at all possible – column heads need not be
repeated; for broadside tables that run beyond two pages,
column heads are repeated only on each new verso.
Where column heads are repeated, the table number and
“continued” should also appear.
For any table that is likely to run to more than one page, the
editor should specify whether continued lines and repeated
column heads will be needed and where footnotes should
appear (usually at the end of the table as a whole).”

Overview
We present MIP models, written in OPL, for three problems that arise
when splitting wide tables, with the aim of minimizing the effect
on the meaning of the data:
1. Minimize Page Count
2. Minimize Page Count and Column Positioning Changes
3. Minimize Page Count and Group Splitting

We then report experimental results obtained with IBM CPLEX 12.3
and present our conclusions.
MIP – Mixed Integer Programming
OPL – Optimization Programming Language
1. Minimum Page Count

1. Minimum Page Count – OPL Model
dvar int+ pageSel[Pages] in 0..1;   // 1 if page p is used
dvar int+ X[Pages][Cols] in 0..1;   // 1 if column j is placed on page p
dexpr int pageCount = sum(p in Pages) pageSel[p];
minimize pageCount;
subject to
{
  ct1: // select only one page for each column
    forall(j in Cols)
      sum(p in Pages) X[p][j] == 1;
  ct2: // only columns that fit in the page
    forall(p in Pages)
      sum(j in Cols)
        colW[j] / pageW * X[p][j] <= pageSel[p];
}
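
The model above is a bin-packing formulation: each column goes on exactly one page, and the widths on a page may not exceed the page width. As a rough illustration only (not the authors' OPL/CPLEX implementation), the Python sketch below compares a left-to-right fill that keeps the column order with an exact search over free reorderings. The function names `next_fit_pages` and `min_page_count` are hypothetical; the widths and page width are the ones used in the example on the later slides.

```python
def next_fit_pages(col_w, page_w):
    """Fill pages left to right, keeping the original column order."""
    pages, current, used = [], [], 0
    for w in col_w:
        if w > page_w:
            raise ValueError("column wider than the page")
        if current and used + w > page_w:
            pages.append(current)
            current, used = [], 0
        current.append(w)
        used += w
    if current:
        pages.append(current)
    return pages


def min_page_count(col_w, page_w):
    """Exact minimum page count when columns may be reordered freely
    (small branch-and-bound search, assumes every column fits on a page)."""
    widths = sorted(col_w, reverse=True)
    best = len(widths)            # one column per page always works
    loads = []                    # used width of each open page

    def place(i):
        nonlocal best
        if len(loads) >= best:    # cannot improve on the best known count
            return
        if i == len(widths):
            best = len(loads)
            return
        w = widths[i]
        tried = set()             # skip pages with an identical load
        for p in range(len(loads)):
            if loads[p] + w <= page_w and loads[p] not in tried:
                tried.add(loads[p])
                loads[p] += w
                place(i + 1)
                loads[p] -= w
        loads.append(w)           # or open a new page
        place(i + 1)
        loads.pop()

    place(0)
    return best


PAGE_W = 490
COL_W = [210, 140, 210, 420, 280, 350, 70, 140, 140, 350]
print(len(next_fit_pages(COL_W, PAGE_W)))  # 7 pages in the original order
print(min_page_count(COL_W, PAGE_W))       # 5 pages after reordering
```

The exhaustive search is only practical for small tables; the talk relies on a MIP solver for the larger instances.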
Splitting Wide Tables Optimally

Slide 8 of 23
1. Minimum Page Count – Results
• Page count can be reduced by 14% to 25%
• The difficulty of the problem is not directly linked to the problem size but to the data itself

Columns    10      20      30      40      50      60
PC         7       16      19      29      34      48
OPC        6       12      15      23      26      39
%Imp       14.28%  25.00%  21.05%  20.68%  23.52%  18.75%
Time       2.25    0.13    0.17    1.18    04.30   1.52
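
The %Imp row is consistent with the relative reduction from PC to OPC, truncated to two decimals. The short check below assumes that PC is the page count before optimization and OPC the optimized page count (the slide does not expand these abbreviations); `improvement_percent` is a hypothetical helper name.

```python
# PC and OPC rows copied from the results table.
pc = [7, 16, 19, 29, 34, 48]
opc = [6, 12, 15, 23, 26, 39]


def improvement_percent(before, after):
    """Relative reduction in percent, truncated (not rounded) to two decimals."""
    return (before - after) * 10000 // before / 100


imp = [improvement_percent(b, a) for b, a in zip(pc, opc)]
print(imp)
```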

2. Minimum Page Count & Column Positioning Changes

pageW: 490 points
colW : [210, 140, 210, 420, 280, 350, 70, 140, 140, 350]
7 pages : {210,140} {210} {420} {280} {350,70} {140,140} {350}
Minimum 5 pages:
colIdx : [1, 7, 8, 5, 2, 9, 6, 10, 3, 4]
Pages : {210,280} {140,350} {420,70} {140,210} {350,140}
Minimum 5 pages and minimum column position changes (posDiff):
colIdx : [1, 2, 3, 5, 4, 7, 6, 8, 9, 10]
Pages : {210,140} {210,280} {420,70} {350,140} {140,350}
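
The two arrangements can be checked with a few lines of Python. This is a sketch under one assumption: that colIdx[j] is the new position of original column j, which reproduces the page contents listed on the slide. The helper names `pages_for` and `pos_diff` are hypothetical; `pos_diff` counts the column pairs whose relative order changed.

```python
PAGE_W = 490
COL_W = [210, 140, 210, 420, 280, 350, 70, 140, 140, 350]


def pages_for(col_idx, col_w, page_w):
    """Order columns by their new position col_idx[j], then fill pages in order."""
    order = sorted(range(len(col_w)), key=lambda j: col_idx[j])
    pages, used = [[]], 0
    for j in order:
        if pages[-1] and used + col_w[j] > page_w:
            pages.append([])
            used = 0
        pages[-1].append(col_w[j])
        used += col_w[j]
    return pages


def pos_diff(col_idx):
    """Number of column pairs (j2 < j1) whose relative order is reversed."""
    n = len(col_idx)
    return sum(1 for j2 in range(n) for j1 in range(j2 + 1, n)
               if col_idx[j1] < col_idx[j2])


min_pages_only = [1, 7, 8, 5, 2, 9, 6, 10, 3, 4]
min_pages_and_moves = [1, 2, 3, 5, 4, 7, 6, 8, 9, 10]

for col_idx in (min_pages_only, min_pages_and_moves):
    print(pages_for(col_idx, COL_W, PAGE_W), pos_diff(col_idx))
```

Both arrangements need 5 pages, but the second reverses far fewer column pairs (2 instead of 20 under this counting).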

dvar int+ pageSel[Pages] in 0..1;
dvar int+ pageIdx[Cols] in Pages;  // page of each column (assumes Pages is an integer range)
dvar int+ colIdx[Cols] in Cols;    // new position of each column (assumes Cols is an integer range)
// check if j1 is positioned before j2 in the original (posO) and new (posN) order
dexpr int posO[j1,j2 in Cols] = j1 <= j2-1;
dexpr int posN[j1,j2 in Cols] = (colIdx[j1] <= colIdx[j2]-1);
dexpr float posDiff = sum(j1,j2 in Cols : j2 < j1)
    abs(posO[j1,j2] - posN[j1,j2]);
dexpr int pageCount = sum(p in Pages) pageSel[p];
// a, b, obj1Val variables are used for OPL flow control
minimize a * pageCount + b * posDiff;

subject to {
  ct1: // do not exceed page width
    forall(p in Pages)
      sum(j in Cols)
        colW[j] * (p==pageIdx[j]) / pageW <= pageSel[p];
  ct2: // page and column indexes relationship
    forall(ordered j1,j2 in Cols)
      (pageIdx[j1]<=pageIdx[j2]-1) (colIdx[j1]<=colIdx[j2]-1) == 0;
  ct3: // unique column index values
    forall(ordered j1,j2 in Cols)
      colIdx[j1] != colIdx[j2];
  // if the minimum page count obj1Val is set
  // maintain this value for subsequent searches
  ct4:
    if (obj1Val >= 0) pageCount == obj1Val;
}
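
One plausible reading of the a, b and obj1Val controls is a two-phase search: first minimize the page count alone (a = 1, b = 0), then fix that page count through ct4 and minimize posDiff (a = 0, b = 1). The brute-force Python sketch below illustrates that reading on a small instance; it is not the authors' OPL flow-control script. The function names are hypothetical, and the six widths are simply the first six columns of the example table.

```python
from itertools import permutations


def page_count(widths, page_w):
    """Pages needed when columns are placed in the given order (greedy fill)."""
    pages, used = 1, 0
    for w in widths:
        if used + w > page_w:
            pages, used = pages + 1, 0
        used += w
    return pages


def inversions(order):
    """Number of column pairs whose relative order changed (posDiff)."""
    n = len(order)
    return sum(1 for x in range(n) for y in range(x + 1, n)
               if order[x] > order[y])


def solve(col_w, page_w, a, b, obj1_val=-1):
    """Minimize a*pageCount + b*posDiff over all column orders."""
    best = None
    for order in permutations(range(len(col_w))):
        pc = page_count([col_w[j] for j in order], page_w)
        if obj1_val >= 0 and pc != obj1_val:   # plays the role of ct4
            continue
        pd = inversions(order)
        cost = a * pc + b * pd
        if best is None or cost < best[0]:
            best = (cost, pc, pd, order)
    return best


col_w, page_w = [210, 140, 210, 420, 280, 350], 490
_, min_pages, _, _ = solve(col_w, page_w, a=1, b=0)              # phase 1
_, pages, moves, order = solve(col_w, page_w, a=0, b=1,
                               obj1_val=min_pages)               # phase 2
print(min_pages, pages, moves)  # 4 4 1
```

Enumerating every permutation only works for a handful of columns, which is why a solver is used in practice.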
Results
• Promising performance:
  – 2.25 s to optimize a 10-column table: posDiff from 33 down to 4, page count from 9 down to 8;
  – 89 s to optimize a 20-column table: posDiff from 194 down to 4, page count from 13 down to 11;
• Computation time increases with the number of columns
• Some data instances admit no better solution

3. Minimum Page Count & Group Splitting

The user specifies which columns should preferably be kept together.
pageW: 490 points
colW : [210, 140, 210, 420, 280, 350, 70, 140, 140, 350]
7 pages : {210,140} {210} {420} {280} {350,70} {140,140} {350}
Minimum 5 pages:
colIdx : [3, 5, 4, 7, 10, 6, 8, 1, 2, 9]
Pages : {210,280} {420} {70,350} {350,140} {210,140,140}
Group columns 2, 3 and 7:
colIdx : [2, 3, 7, 4, 9, 10, 6, 8, 1, 5]
Pages : {140,210,70} {420} {140,350} {350,140} {210,280}
int colG[Cols] = ...; // column groups
dvar int+ pageSel[Pages] in 0..1;
dvar int+ pageIdx[Cols] in Pages; // page of each column (assumes Pages is an integer range)
// find the first column of the group
int gFirstCol[g in groups] =
    first({j | j in Cols : colG[j] == g});
// counts how many columns of a group are on a
// different page than the group's first column
dexpr int gSplit[g in groups] =
    sum(j in Cols : colG[j] == g)
        (pageIdx[j] != pageIdx[gFirstCol[g]]);
// number of groups that are split across pages
dexpr int gSplitCount = sum(g in groups)
    (gSplit[g] >= 1);
dexpr int pageCount = sum(p in Pages) pageSel[p];
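
To make the gSplit and gSplitCount expressions concrete, the Python sketch below evaluates them for the two 5-page arrangements of the example. Two assumptions are made here: a group id of 0 marks an ungrouped column, and the colIdx lists in the example are read as the left-to-right column order, from which the per-column page numbers were derived by hand. The function names are hypothetical.

```python
def g_split(col_g, page_idx):
    """Per group: number of columns not on the page of the group's first column."""
    first_col = {}
    for j, g in enumerate(col_g):
        if g != 0 and g not in first_col:   # 0 = ungrouped (assumption)
            first_col[g] = j
    return {
        g: sum(1 for j, gj in enumerate(col_g)
               if gj == g and page_idx[j] != page_idx[f])
        for g, f in first_col.items()
    }


def g_split_count(col_g, page_idx):
    """Number of groups that are split across pages."""
    return sum(1 for s in g_split(col_g, page_idx).values() if s >= 1)


# Columns 2, 3 and 7 form group 1; all other columns are ungrouped.
col_g = [0, 1, 1, 0, 0, 0, 1, 0, 0, 0]

# Page of each original column in the "Minimum 5 pages" arrangement.
page_idx_min = [5, 5, 1, 2, 1, 4, 3, 4, 5, 3]
# Page of each original column when columns 2, 3 and 7 are grouped.
page_idx_grouped = [5, 1, 1, 2, 5, 4, 1, 4, 3, 3]

print(g_split(col_g, page_idx_min), g_split_count(col_g, page_idx_min))
print(g_split(col_g, page_idx_grouped), g_split_count(col_g, page_idx_grouped))
```

In the first arrangement the group is spread over three pages (gSplit = 2); in the second all three grouped columns share page 1, so no group is split.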
// a, b, obj1Val variables are used for OPL flow control
minimize a * pageCount + b * gSplitCount;
subject to {
  ct1: // do not exceed page width
    forall(p in Pages)
      sum(j in Cols)
        colW[j] * (p==pageIdx[j]) / pageW <= pageSel[p];
  // if the minimum page count obj1Val is set
  // maintain this value for subsequent searches
  ct2:
    if (obj1Val >= 0) pageCount == obj1Val;
}

Results
• Promising performance:
  – 1 min for a 20-column table with 3 groups, none split, page count from 12 down to 9;
  – 2 min for 30- to 40-column tables, but time increased up to 12 min when the number of groups increased;
• Computation time increases with the number of columns and groups
• Some relaxed solutions may be preferred

Conclusions

• An optimal arrangement of columns that minimizes the page count when splitting wide tables can be found in relatively short running time; for tables with 60 columns a solution was found in less than 2 s;
• If additional criteria are added, for example minimizing the number of relative column position changes, the problems become harder as the number of columns increases;
• The difficulty of the problems depends not only on the problem size but also on the complexity of the data.

Ongoing work
Minimizing the overall page count when a large table
containing text is displayed on fixed size pages and
neither column widths nor row heights are known in
advance.

Thank you!

www.tabularlayout.org

